(54) Title: METHOD FOR DETECTION AND TREATMENT OF BREAST CANCER
(57) Abstract 



The present invention provides a method of detecting and diagnosing pre-invasive breast cancer by identifying differentially expressed
genes in early, pre-invasive breast cancer tissue. Differentially expressed genes can be used as genetic markers to indicate the presence
of pre-invasive cancerous tissues. Microscopically directed tissue sampling techniques combined with differential display or differential
screening of cDNA libraries are used to determine differential expression of genes in the early stages of breast cancer. Differential
expression of genes in pre-invasive breast cancer tissue is confirmed by RT-PCR, nuclease protection assays and in-situ hybridization of
ductal carcinoma in situ tissue RNA and control tissue RNA. The present invention also provides a method of screening for compounds that
induce expression of the BRCA1 gene, whose product negatively regulates cell growth in both normal and malignant mammary epithelial
cells. The present invention also relates to gene therapy methods using this gene.

DESCRIPTION

"METHOD FOR DETECTION AND TREATMENT OF BREAST CANCER"
TECHNICAL FIELD 

The present invention relates generally to methods of detection and diagnosis of
breast cancer and more particularly to a diagnostic method which relies on the 
identification of marker genes expressed in pre-invasive cancers by microscopically- 
directed cloning. Furthermore, this invention concerns the prevention, detection, and 
diagnosis of breast cancer by addressing the molecular events which occur during the 
earliest alterations in breast tissue. 

The present invention also relates generally to methods of treatment of breast
cancer, and more particularly to gene therapy methods and methods for screening
compounds that induce expression of the BRCA1 gene product.



BACKGROUND ART 

It will be appreciated by those skilled in the art that there exists a need for a
more sensitive and less invasive method of early detection and diagnosis of breast
cancer than those methods currently in use. Breast cancer presents inherent difficulties
in regard to the ease with which it is detected and diagnosed. This is in contrast to
detection of some other common cancers, including skin and cervical cancers, the latter
of which is based on cytomorphologic screening techniques.

There have been several attempts to develop improved methods of breast cancer
detection and diagnosis. Numerous studies have searched for oncogene mutations, gene
amplification, and loss of heterozygosity in invasive breast cancer (Callahan et al,
1992; Cheickh et al, 1992; Chen et al, 1992; and, Lippman et al, 1990). However,
few studies of breast cancer have analyzed gene mutations and/or altered gene
expression in ductal carcinoma in situ (DCIS). Investigators have demonstrated high
levels of p53 protein in 13-40% of DCIS lesions employing a monoclonal antibody to
p53, and subsequent sequencing demonstrated mutations in several cases (Poller et al,
1992). The neu/erbB2 gene appears to be amplified in a subset of DCIS lesions (Allred
et al, 1992; Maguire et al, 1992). Histologic analysis of DCIS cases suggests that
mutations and altered gene expression events, as well as changes in chromatin and
DNA content, occur predominantly in comedo DCIS (Bocker et al, 1992; Killeen et al,
1991; and, Komitowski et al, 1990), which has a rapid rate of local invasion and
progression to metastasis. Thus, there are presently no reliable marker genes for
non-comedo DCIS (NCDCIS, hereafter).

Cancer in humans appears to be a multi-step process which involves progression
from pre-malignant to malignant to metastatic disease which ultimately kills the patient.
Epidemiologic studies in humans have established that certain pathologic conditions are
"pre-malignant" because they are associated with increased risk of malignancy. There
is precedent for detecting and eliminating pre-invasive lesions as a cancer prevention
strategy: dysplasia and carcinoma in-situ of the uterine cervix are examples of
pre-malignancies whose detection by cytologic screening methods has been successfully
employed in the prevention of cervical cancer. Unfortunately, because the breast cannot be
sampled as readily as the cervix, the development of screening methods for breast
pre-malignancy involves more complex approaches than the cytomorphologic screening
currently employed to detect cervical cancer.

Pre-malignant breast disease is also characterized by an apparent morphological
progression from atypical hyperplasias, to carcinoma in-situ (pre-invasive cancer) to
invasive cancer which ultimately spreads and metastasizes resulting in the death of the
patient. Careful histologic examination of breast biopsies has demonstrated
intermediate stages which have acquired some of these characteristics but not others.
Detailed epidemiological studies have established that different morphologic lesions
progress at different rates, varying from atypical hyperplasia (with a low risk) to
comedo ductal carcinoma-in-situ which progresses to invasive cancer in a high
percentage of patients (London et al, 1991; Page et al, 1982; Page et al, 1985; Page
et al, 1991; and Page et al, 1978). Family history is also an important risk factor in
the development of breast cancer and increases the relative risk of these pre-malignant
lesions (Dupont et al, 1985; Dupont et al, 1993; and, London et al, 1991). Of
particular interest is non-comedo carcinoma-in-situ which is associated with a greater
than ten-fold increased relative risk of breast cancer compared to control groups
(Ottesen et al, 1992; Page et al, 1982). Two other reasons besides an increased relative
risk support the concept that DCIS is pre-malignant: 1) When breast cancer occurs in
these patients it regularly occurs in the same region of the same breast where the DCIS
was found; and 2) DCIS is frequently present in tissue adjacent to invasive breast
cancer (Ottesen et al, 1992; Schwartz et al, 1992). For these reasons DCIS very likely
represents a rate-limiting step in the development of invasive breast cancer in women.

DCIS (sometimes called intraductal carcinoma) is a group of lesions in
which the cells have grown to completely fill the duct with patterns similar to invasive
cancer, but do not invade outside the duct or show metastases at presentation. DCIS
occurs in two forms: comedo DCIS and non-comedo DCIS. Comedo DCIS is often a
grossly palpable lesion which was probably considered "cancer" in the 19th and early
20th century and progresses to cancer (without definitive therapy) in at least 50% of
patients within three years (Ottesen et al, 1992; Page et al, 1982). Most of the
molecular alterations which have been reported in pre-malignant breast disease have
been observed in cases of comedo DCIS (Poller et al, 1993; Radford et al, 1993; and,
Tsuda et al, 1993). Non-comedo DCIS is detected by microscopic analysis of breast
aspirates or biopsies and is associated with a 10-fold increased risk of breast cancer,
which corresponds to a 25-30% absolute risk of breast cancer within 15 years (Ottesen
et al, 1992; Page et al, 1982; and, Ward et al, 1992).
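
The relationship between the 10-fold relative risk and the 25-30% absolute risk quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming that relative risk is the ratio of the absolute risk in the NCDCIS group to the risk in a comparison group over the same 15-year window; the function name is hypothetical, and the implied comparison-group risk is a back-calculation, not a figure stated in the text.

```python
def implied_baseline_risk(absolute_risk, relative_risk):
    """Back-calculate the comparison-group risk implied by an absolute
    risk and a relative risk (assumes both refer to the same time window)."""
    if relative_risk <= 0:
        raise ValueError("relative risk must be positive")
    return absolute_risk / relative_risk


if __name__ == "__main__":
    # Absolute risks and the 10-fold relative risk are the values quoted in the text.
    for absolute in (0.25, 0.30):
        baseline = implied_baseline_risk(absolute, 10.0)
        print(f"absolute risk {absolute:.0%}, 10-fold relative risk: "
              f"implied comparison-group risk {baseline:.1%}")
```

Under that assumption, the quoted figures imply a comparison-group risk of a few percent over the same period.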

Widespread application of mammography has changed the relative incidence of
comedo and non-comedo DCIS such that NCDCIS now represents the predominant 
form of DCIS diagnosed in the United States (Ottesen et al, 1992; Page et al, 1982; and 
Pierce et al, 1992). Both forms of DCIS generally recur as invasive cancer at the same 
site as the pre-malignant lesion (without definitive therapy). The precursor lesions to 
DCIS are probably atypical ductal hyperplasia and proliferative disease without atypia 
which are associated with lower rates of breast cancer development, but show further 
increased risk when associated with a family history of breast cancer (Dupont et al, 
1985; Dupont et al, 1989; Dupont et al, 1993; Lawrence, 1990; London et al, 1991; 
Page et al, 1982; Page et al, 1985; Page et al, 1991; Page et al, 1978; Simpson et al, 
1992; Solin et al, 1991; Swain, 1992; Weed et al, 1990). 

What is needed, then, is a sensitive method of detection and diagnosis of breast
cancer when the cancerous cells are still in the pre-invasive stage. To illustrate the
usefulness in early breast cancer detection of a marker gene and its encoded protein,
consider the dramatic impact that prostate specific antigen has had on early stage
prostate cancer. Such a method of early detection and diagnosis of breast cancer is
presently lacking in the prior art.

Breast cancer occurs in hereditary and sporadic forms. Recently the BRCA1
gene has been cloned and shown to be mutated in kindreds with hereditary breast and
ovarian cancer (Hall et al, 1990; Miki et al, 1994; Friedman et al, 1994; Castilla
et al, 1994; Simard et al, 1994). Although 92% of families with two or more cases of
early-onset breast cancer and two cases of ovarian cancer have germ-line mutations in
BRCA1 (Narod et al, in press), the gene has not been shown to be mutated in any
truly sporadic case to date (Futreal et al, 1994). Despite the surprising paucity of
somatically acquired mutations in sporadic breast cancer, BRCA1 is still likely a tumor
suppressor gene with a key role in breast epithelial cell biology. The BRCA1 gene
encodes a protein of 1863 amino acids with a predicted zinc finger domain observed
in proteins which regulate gene transcription. Until the discovery of the function of the
BRCA1 gene in conjunction with the development of the present invention, the function
was unknown.

DISCLOSURE OF THE INVENTION

Epidemiologic studies have established that NCDCIS of the breast is associated 
with a ten-fold increased risk of breast cancer (absolute risk of 25-30%). It seems 
likely that this pre-invasive lesion is a determinate precursor of breast cancer because 
the subsequent development of breast cancer is regularly in the same region of the same 
breast in which the NCDCIS lesion was found. Important aspects of the present 
invention concern isolated DNA segments and those isolated DNA segments inserted 
into recombinant vectors encoding differentially expressed marker genes in abnormal
tissue, specifically in NCDCIS, as compared with those expressed in normal tissue, and 
the creation and use of recombinant host cells through the application of DNA 
technology, which express these differentially expressed marker genes (Sambrook et al, 
1989). 

Because there are no cell lines or animal models which clearly display known
characteristics of pre-invasive breast disease, human breast tissue samples are essential
for studying pre-invasive breast disease. Using human tissue samples, we
have developed a method for cDNA cloning from histologically identified lesions in
human breast biopsies. We have used this method to clone genes which are
differentially expressed in pre-invasive breast lesions such as NCDCIS lesions as
compared to genes expressed in normal tissue. The differentially expressed genes
detected in pre-invasive breast cancer are called marker genes. Identification of marker
genes for pre-invasive breast disease provides improved methods for detection and
diagnosis of pre-invasive breast cancer tissue, and further provides marker genes for
studies of the molecular events involved in progression from pre-invasive to malignant
breast disease.

Analysis of marker gene expression in NCDCIS presents the advantage that
cancerous breast tissue at that stage is non-invasive. Detection and diagnosis of
NCDCIS by means of differentially expressed marker genes, compared to the same
marker genes in normal breast tissue, would allow a greater ability to detect, prevent
and treat the disease before it becomes invasive and metastasizes. The stage or
intermediate condition of NCDCIS is a particularly good candidate for early
intervention because 1) it is prior to any invasion and thus prior to any threat to life;
2) it is followed by invasive carcinoma in over 30% of cases if only treated by biopsy;
and, 3) there is a long "window" of opportunity (approximately 4-8 years) before
invasive neoplasia occurs. Thus, NCDCIS is an ideal target for early diagnosis. While
these morphologically defined intermediate endpoints have been widely accepted, 
progress in defining the molecular correlates of these lesions has been hampered by an 
inability to identify and sample them in a manner which would allow the application of 
molecular techniques. 

Frozen tissue blocks from breast biopsies were used to construct and screen 
cDNA libraries prepared from NCDCIS tissue, normal breast tissue, breast cancer 
tissue, and normal human breast epithelial cells. Several cDNAs which were 
differentially expressed in human DCIS epithelial cells compared to normal breast 
epithelial cells were cloned and sequenced. One gene which is differentially expressed
is the M2 subunit of RibRed which is expressed at low levels in human breast epithelial
cells but at higher levels in 4 out of 5 DCIS tissue samples. It is presumed that the
altered morphologic appearance and determinant biologic behavior of DCIS result from
altered expression of genes (such as RibRed) which are important in the induction of
breast cancer in humans.

This invention, therefore, provides a method of detecting and diagnosing
pre-invasive breast cancer by analyzing marker genes which are differentially expressed in
non-comedo DCIS cells. Histopathologic studies have demonstrated that these
morphologic patterns in breast tissue lead to invasive breast cancer in at least 20-30%
of patients. The present method analyzes gene expression in normal, pre-malignant and
malignant breast biopsies; and, it allows simultaneous comparison and cloning of
marker genes which are differentially expressed in pre-invasive breast cancer. These
marker genes can then be used as probes to develop other diagnostic tests for the early
detection of pre-invasive breast cancer.

The present invention concerns DNA segments, isolatable from both normal and
abnormal human breast tissue, which are free from total genomic DNA. The isolated
DCIS-1 protein product is the regulatory element of the RibRed enzyme. This and all
other isolatable DNA segments which are differentially expressed in pre-invasive breast
cancer can be used in the detection, diagnosis and treatment of breast cancer in its
earliest and most easily treatable stages. As used herein, the term "abnormal tissue"
refers to pre-invasive and invasive breast cancer tissue, as exemplified by collected
samples of non-comedo or comedo DCIS tissues.

As used herein, the term "DNA segment" refers to a DNA molecule which has 
been isolated free of total genomic DNA of a particular species. Therefore, a DNA 
segment encoding a differentially expressed protein (as measured by the expression of 
mRNA) in abnormal tissue refers to a DNA segment which contains differentially 
expressed-coding sequences in abnormal tissue as compared to those expressed in 
normal tissue, yet is isolated away from, or purified free from, total genomic DNA of 
Homo sapiens sapiens. Furthermore, a DNA segment encoding a BRCA1 protein
refers to a DNA segment which contains BRCA1 coding sequences, yet is isolated away
from, or purified free from, total genomic DNA of Homo sapiens sapiens. Included
within the term "DNA segment" are DNA segments and smaller fragments of such
segments, and also recombinant vectors, including, for example, plasmids, cosmids,
phage, viruses, and the like.

Similarly, a DNA segment comprising an isolated or purified differentially
expressed gene or comprising an isolated or purified BRCA1 gene refers to a DNA
segment including differentially expressed coding sequences or BRCA1 coding
sequences isolated substantially away from other naturally occurring genes or protein
encoding sequences. In this respect, the term "gene" is used for simplicity to refer to
a functional protein, polypeptide or peptide encoding unit. As will be understood by
those in the art, this functional term includes both genomic sequences and cDNA
sequences. "Isolated substantially away from other coding sequences" means that the
gene of interest, in this case, any differentially expressed marker gene or the BRCA1
gene, forms the significant part of the coding region of the DNA segment, and that the
DNA segment does not contain large portions of naturally-occurring coding DNA, such
as large chromosomal fragments or other functional genes or cDNA coding regions.
Of course, this refers to the DNA segment as originally isolated, and does not exclude
genes or coding regions later added to the segment by the hand of man.

In particular embodiments, the invention concerns isolated DNA segments and
recombinant vectors incorporating DNA sequences which encode differentially
expressed genes in pre-invasive breast cancer, each of which includes within its amino
acid sequence an amino acid sequence in accordance with SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID
NO:7. SEQ ID NOs:1-7 are all derived from non-comedo DCIS samples from Homo
sapiens sapiens. In other particular embodiments, the invention concerns isolated DNA
segments and recombinant vectors incorporating DNA sequences which encode the M2
subunit of human RibRed that includes within its amino acid sequence an amino
acid sequence similar to that of the M2 subunit of hamster RibRed.

In certain embodiments, the invention concerns isolated DNA segments and
recombinant vectors which partially or wholly encode a protein or peptide that includes
within its amino acid sequence an amino acid sequence essentially as partially or wholly
encoded, respectively, by SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. Naturally, where the DNA
segment or vector encodes a full length differentially expressed protein, or is intended
for use in expressing the differentially expressed protein, the most preferred sequences
are those which are essentially as set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7 and which
encode a protein that exhibits differential expression, e.g., as may be determined by the
differential display or differential sequencing assay, as disclosed herein.

The term "a sequence essentially as set forth in SEQ ID NO:1, SEQ ID NO:2,
SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7"
means that the sequence substantially corresponds to a portion of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID
NO:7, respectively, and has relatively few nucleotides which are not identical to, or a
biologically functional equivalent of, the nucleotides of the respective SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ
ID NO:7. The term "biologically functional equivalent" is well understood in the art
and is further defined in detail herein, for example see pages 24 through 25.
Accordingly, sequences which have between about 70% and about 80%; or more
preferably, between about 81% and about 90%; or even more preferably, between
about 91% and about 99%; of amino acids which are identical or functionally
equivalent to the amino acids of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7 will be sequences which
are "essentially as set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7", respectively.
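
The percentage ranges in the preceding definition can be illustrated with a short calculation. The sketch below is illustrative only: it assumes the two sequences have already been aligned to equal length, it counts strictly identical residues (it does not attempt to score "functionally equivalent" residues, which the specification defines elsewhere), and the exact numeric cutoffs and the treatment of a fully identical sequence are assumptions, since the text says "about". The function names and example peptides are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in two pre-aligned,
    equal-length sequences (no gap handling; an illustrative simplification)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def preference_tier(identity):
    """Map a percent identity onto the three ranges named in the text.
    Cutoffs are assumed; 100% is grouped with the top tier for convenience."""
    if identity >= 91:
        return "even more preferred (about 91-99%)"
    if identity >= 81:
        return "more preferred (about 81-90%)"
    if identity >= 70:
        return "within range (about 70-80%)"
    return "outside the stated ranges"


if __name__ == "__main__":
    # Hypothetical 10-residue peptides differing at one position.
    reference = "ACDEFGHIKL"
    candidate = "ACDEFGHIKV"
    identity = percent_identity(reference, candidate)
    print(identity, preference_tier(identity))
```

A real comparison would first align the sequences and would need a rule for conservative substitutions; both are outside the scope of this sketch.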

In particular embodiments, the invention concerns a drug screening method and
a gene therapy method that use isolated DNA segments and recombinant vectors
incorporating DNA sequences which encode a protein that includes within its amino
acid sequence an amino acid sequence in accordance with SEQ ID NO:49, SEQ ID
NO:49 being derived from breast tissue from Homo sapiens. In other particular
embodiments, the invention concerns isolated DNA sequences and recombinant DNA
vectors incorporating DNA sequences which encode a protein that includes within its
amino acid sequence the amino acid sequence of the BRCA1 gene product from human
breast tissue.

In certain embodiments, the invention concerns methods using isolated DNA
segments and recombinant vectors which partially or wholly encode a protein or peptide
that includes within its amino acid sequence an amino acid sequence essentially as set
forth in SEQ ID NO:49. Naturally, where the DNA segment or vector encodes a full
length BRCA1 protein, or is intended for use in expressing the BRCA1 protein, the
most preferred sequences are those which are essentially as set forth in SEQ ID NO:47
and which encode a protein that retains activity as a negative growth regulator in human
breast cells, as may be determined by antisense assay, as disclosed herein.

The term "a sequence essentially as set forth in SEQ ID NO:49" means that the
sequence substantially corresponds to a portion of SEQ ID NO:49 and has relatively
few amino acids which are not identical to, or a biologically functional equivalent of,
the amino acids of SEQ ID NO:49. The term "biologically functional equivalent" is
well understood in the art and is further defined in detail herein, for example see pages
24 through 25. Accordingly, sequences which have between about 70% and about
80%; or more preferably, between about 81% and about 90%; or even more
preferably, between about 91% and about 99%; of amino acids which are identical or
functionally equivalent to the amino acids of SEQ ID NO:49 will be sequences which
are "essentially as set forth in SEQ ID NO:49".

In certain other embodiments, the invention concerns isolated DNA segments
and recombinant vectors that include within their sequence a nucleic acid sequence
essentially as set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7. The term "essentially as set forth
in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ
ID NO:6, and SEQ ID NO:7" is used in the same sense as described above and means
that the nucleic acid sequence substantially corresponds to a portion of SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ
ID NO:7, respectively, and has relatively few codons which are not identical, or
functionally equivalent, to the codons of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7, respectively.
Again, DNA segments which encode proteins exhibiting differential expression will be
most preferred. The term "functionally equivalent codon" is used herein to refer to
codons that encode the same amino acid, such as the six codons for arginine or serine,
and also refers to codons that encode biologically equivalent amino acids (see Figure
8).
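
The first sense of "functionally equivalent codon" (codons that encode the same amino acid) can be illustrated with the standard genetic code. The sketch below builds the standard DNA codon table and lists the synonymous codons for a given amino acid; the function names are hypothetical. It does not cover the second sense (codons for biologically equivalent amino acids), which depends on the groupings of Figure 8 and is not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return all DNA codons encoding the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def encode_same_amino_acid(codon_a, codon_b):
    """True if both codons encode the same amino acid in the standard code."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


if __name__ == "__main__":
    print("Arginine:", synonymous_codons("R"))
    print("Serine:  ", synonymous_codons("S"))
    print(encode_same_amino_acid("CGT", "AGA"))
```

Arginine and serine each have six codons in the standard code, which is why the text singles them out as examples.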

In certain other embodiments, the invention concerns a method for screening
drugs and a gene therapy method which involve the use of isolated DNA segments and
recombinant vectors that include within their sequence a nucleic acid sequence
essentially as set forth in SEQ ID NO:47 and SEQ ID NO:48. The term "essentially
as set forth in SEQ ID NO:47 and SEQ ID NO:48" is used in the same sense as
described above and means that the nucleic acid sequence substantially corresponds to
a portion of SEQ ID NO:47 and SEQ ID NO:48, respectively, and has relatively few
codons which are not identical, or functionally equivalent, to the codons of SEQ ID
NO:47 and SEQ ID NO:48, respectively. Again, DNA segments which encode
proteins exhibiting the negative regulatory activity of BRCA1 will be most
preferred. The term "functionally equivalent codon" is used herein to refer to codons
that encode the same amino acid, such as the six codons for arginine or serine, and also
refers to codons that encode biologically equivalent amino acids (see Figure 8).

It will also be understood that amino acid and nucleic acid sequences may
include additional residues, such as additional N- or C-terminal amino acids or 5' or
3' sequences, and yet still be essentially as set forth in one of the sequences disclosed
herein, so long as the sequence meets the criteria set forth above, including the
maintenance of biological protein activity where protein expression is concerned. The
addition of terminal sequences particularly applies to nucleic acid sequences which may,
for example, include various non-coding sequences flanking either of the 5' or 3'
portions of the coding region or may include various internal sequences, i.e., introns,
which are known to occur within genes.

Excepting intronic or flanking regions, and allowing for the degeneracy of the
genetic code, sequences which have between about 20% and about 50%; or more
preferably, between about 50% and about 70%; or even more preferably, between
about 70% and about 99%; of nucleotides which are identical to the nucleotides of SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, and SEQ ID NO:7 will be sequences which are "essentially as set forth in SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, and SEQ ID NO:7", respectively. Sequences which are essentially the same as
those set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ
ID NO:5, SEQ ID NO:6, and SEQ ID NO:7 may also be functionally defined as
sequences which are capable of hybridizing to a nucleic acid segment containing the
complement of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:6, and SEQ ID NO:7, respectively, under relatively stringent
conditions. Suitable relatively stringent hybridization conditions will be well known to
those of skill in the art (Sambrook et al, 1989).
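
The notion of a sequence hybridizing to "the complement" of another can be made concrete with a reverse-complement calculation. The following minimal sketch only tests perfect Watson-Crick pairing between two DNA strands written 5' to 3'; it is not a model of hybridization under stringent conditions, which tolerates some mismatch and depends on temperature and salt concentration. The function names and the example sequence are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def pairs_perfectly(strand_a, strand_b):
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


if __name__ == "__main__":
    probe = "ATGCCGTA"  # hypothetical example sequence
    target = reverse_complement(probe)
    print(target, pairs_perfectly(probe, target))
```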

Excepting intronic or flanking regions, and allowing for the degeneracy of the
genetic code, sequences which have between about 20% and about 50%; or more
preferably, between about 50% and about 70%; or even more preferably, between
about 70% and about 99%; of nucleotides which are identical to the nucleotides of SEQ
ID NO:47 and SEQ ID NO:48 will be sequences which are "essentially as set forth in
SEQ ID NO:47 and SEQ ID NO:48", respectively. Sequences which are essentially
the same as those set forth in SEQ ID NO:47 and SEQ ID NO:48 may also be
functionally defined as sequences which are capable of hybridizing to a nucleic acid
segment containing the complement of SEQ ID NO:47 and SEQ ID NO:48,
respectively, under relatively stringent conditions. Suitable relatively stringent
hybridization conditions will be well known to those of skill in the art (Sambrook et al,
1989).

It is also important to understand the molecular events which lead to progression 
from pre-invasive to invasive breast cancer. Breast cancer is a disease that is presumed 
to involve a series of genetic alterations that confer increasing growth independence and 
metastatic capability on somatic cells. Identifying the molecular events that lead to the
initial development of a neoplasm is therefore critical to understanding the fundamental
mechanisms by which tumors arise and to the selection of optimal targets for gene
therapy and chemopreventive agents. As intermediate endpoints in neoplastic
development, some pre-malignant breast lesions represent important, and possibly
rate-limiting steps in the progression of human breast cancer, and careful
epidemiological studies have established the relative risk for breast cancer development
for specific histologic lesions. In particular, invasive breast cancer develops in the
region of the previous biopsy site in at least 25-30% of patients following diagnosis of
non-comedo DCIS, providing strong evidence that this pre-malignant lesion is a
determinant event in breast cancer progression. While these morphologically defined
intermediate endpoints have been widely accepted, progress in defining the molecular
correlates of these lesions has been hampered by an inability to identify and sample
them in a manner which would allow the application of molecular techniques.

The present invention includes a comparison of gene expression between 
multiple breast tissue biopsy samples as a means to identify differentially expressed 
genes in pre-malignant breast disease compared with normal breast tissue. These 
genetic markers should be extremely useful reagents for early diagnosis of breast 
cancer, and for the delineation of molecular events in progression of breast cancer. 

Identification of gene markers which are expressed in the majority of
pre-invasive breast cancer tissue samples involves cDNA library preparation from both
normal and abnormal tissue. This is followed by either a modified differential display
method or a differential screening method to identify differential expression of genes
which is subsequently confirmed by RT-PCR, nuclease protection assays and in situ
hybridization of DCIS tissue RNA and control tissue RNAs (Sambrook et al, 1989).
Use of genetic engineering methods can bias the screening to specifically identify genes 
whose encoded proteins are secreted or are present at the cell surface, in order to find 
proteins which will be useful markers for diagnostic blood tests (secreted proteins) or 
for diagnostic imaging studies (cell surface proteins). 

Thus, the method of the present invention begins with the collection of at least 
one tissue sample by a microscopically-directed collection step in which a punch biopsy 
is obtained exclusively from abnormal tissue which exhibits histological or cytological 
characteristics of pre-invasive breast cancer. Preferably, the sample site will be an 
isolatable tissue structure, such as ductal epithelial cells from pre-invasive breast 
cancer tissue. The mRNA is purified from the sample. Then, a cDNA library is 
prepared from the mRNA purified from the abnormal tissue sample (Sambrook et al,
1989).

A normal tissue sample is then obtained from the patient, using a sample site 
from an area of tissue which does not exhibit histological or cytological characteristics 
of pre-invasive cancer. A cDNA library is also prepared from this normal tissue 
sample. 

The abnormal tissue cDNA library can then be compared with the normal tissue 
cDNA library by differential display or differential screening to determine whether the 
expression of at least one marker gene in the abnormal tissue sample is different from 
the expression of the same marker gene in the normal tissue sample. 
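
The comparison step above can be illustrated with a minimal sketch. It assumes that expression has already been quantified per gene for each library (for example as band intensities). The gene names, the numeric values and the two-fold cut-off are hypothetical and are not taken from this disclosure.

```python
def differentially_expressed(abnormal, normal, min_fold=2.0):
    """Return {gene: fold_change} for genes whose expression differs between samples.

    abnormal, normal: dicts mapping gene name -> positive expression level.
    min_fold: assumed cut-off. A gene is reported when abnormal/normal is
              >= min_fold (over-expressed) or <= 1/min_fold (under-expressed).
    """
    hits = {}
    # Only genes measured in both samples can be compared.
    for gene in abnormal.keys() & normal.keys():
        fold = abnormal[gene] / normal[gene]
        if fold >= min_fold or fold <= 1.0 / min_fold:
            hits[gene] = fold
    return hits


# Hypothetical, illustrative values only.
abnormal_sample = {"geneA": 8.0, "geneB": 1.0, "geneC": 0.5}
normal_sample = {"geneA": 2.0, "geneB": 1.1, "geneC": 4.0}
markers = differentially_expressed(abnormal_sample, normal_sample)
```

In this toy input, geneA and geneC would be reported as candidate marker genes, while geneB would not. Any real cut-off would have to be chosen and validated experimentally.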

Further diagnostic steps can be added to the method by cloning the marker gene 
using sequence-based amplification to create a cloned marker gene which can then be 
DNA-sequenced in order to derive the protein sequence. The protein sequence is then 
used to generate antibodies which will recognize these proteins by antibody recognition 
of the antigen. The presence of the antibody-recognized antigen can then be detected 
by means of conventional medical diagnostic tests.

This invention also includes methods of screening for compounds and gene
therapy methods using the BRCA1 gene. BRCA1 mRNA is expressed at 5-10 fold
higher levels in normal mammary tissue than in invasive breast cancer samples.
Having demonstrated that mRNA expression levels of BRCA1 are higher in normal
mammary cells than in cancer cells, antisense methods were used to test the hypothesis
that BRCA1 expression inhibits cell growth. These tests showed that diminished
expression of BRCA1 increased the proliferative rate of breast cells.

An object of the present invention, then, is to provide a method of early 
detection of pre-invasive breast cancer in human tissue. 

It is a further object of this invention to identify early marker genes for pre-
invasive breast disease which can be used in screening methods for early pre-invasive
breast cancer.

It is also an object of this invention to produce a cDNA library from pre- 
invasive breast cancer tissue resulting in a permanent genetic sample of that pre- 
invasive breast cancer tissue. 

It is also an object of this invention to provide a drug or biological screening
method using the BRCA1 promoter region and a gene therapy method using the BRCA1
gene.



List of Abbreviations

TPA       Phorbol 12-myristate 13-acetate
MCF-7     An immortalized cell line derived from a metastasis of human breast cancer
HMEC      A primary (non-immortalized) cell line derived from breast epithelial
          cells obtained during reduction mammoplasty
DCIS      Ductal Carcinoma in situ
NCDC      Non-Comedo Ductal Carcinoma in situ
cDNA      Complementary DNA obtained from an RNA template
DNA       Deoxyribonucleic Acid
RT-PCR    Reverse Transcriptase-Polymerase Chain Reaction
RibRed    Ribonucleotide Reductase

Fig. 1 shows Table I which describes anatomic lesion types in the human breast
with pre-malignant implication.

Fig. 2 shows a model for pre-malignant conditions, highlighting magnitude of 
risk for progression to clinical malignancy. 

Fig. 3 contains color photos of DCIS tissue, before (upper left panel) and after 
microscopically-directed excisional punch biopsy (upper right panel). The lower panels 
show tissue samples of normal breast tissue (lower left panel), and invasive breast 
cancer (lower right panel). 

Fig. 4 shows expression of collagen III mRNA in tissue mRNA samples,
analyzed by RNase protection assay methods. 

Fig. 5 shows differential display of cDNAs obtained from patient tissue samples 
and controls. 

Fig. 6 shows a comparison of the sequence between DCIS-1 and the human and
hamster genes. 

Fig. 7 shows expression of DCIS-1 mRNA in tissue mRNA samples analyzed 
by RNase protection assay as described in the legend to Figure 4. 

Fig. 8 is Table II which displays the genetic code.

Fig. 9 is a Table which lists differentially expressed marker genes. 

Figs. 10A and 10B show expression of BRCA1 mRNA during breast cancer
progression by PCR detection and nuclease protection assay, respectively.

Figs. 11A and 11B are a comparison of BRCA1 expression in normal breast and
invasive breast cancer using nuclease protection assay of RNA, respectively.

Figs. 12A, 12B, and 12C show that antisense inhibition of BRCA1 accelerates
mammary cell proliferation.

Figs. 13A and 13B include a Northern blot of mRNA and nuclear run-on studies
that show that ribonucleotide reductase M2 mRNA is cell cycle regulated in MCF-7
cells.

Fig. 14 includes a nuclease protection assay that shows that antisense inhibition
of BRCA1 in human mammary cells decreases BRCA1 mRNA and increases
ribonucleotide reductase mRNA.

UTILITY STATEMENT

The detection of differentially expressed genes in pre-invasive breast tissue,
specifically in non-comedo ductal carcinoma in situ as compared to genes expressed in
normal tissue, is useful in the diagnosis, prognosis and treatment of human breast
cancer. Such differentially expressed genes are effective marker genes indicating the
significantly increased risk of breast cancer in a patient expressing these differentially
expressed marker genes. These marker genes are useful in the detection, early
diagnosis, and treatment of breast cancer in humans.

The discovery of the function of the BRCA1 gene has broad utility including,
in the present invention, development of methods to treat familial and sporadic breast
cancers as well as screen for therapeutic drugs through production of important
indicator compounds.



ACTIVITY STATEMENT 

Of the differentially expressed genes described in this invention, DCIS-1
encodes a gene similar to the M2 subunit of hamster ribonucleotide reductase. The
M2 subunit of ribonucleotide reductase (RibRed, hereafter) is responsible for regulation
of RibRed. The differential levels of expression of the marker genes described in this
invention (Seq. ID Nos. 1-7) indicate genetic changes which have been linked to the
presence of pre-invasive breast cancer.

The BRCA1 gene (Seq. ID No. 47) is differentially expressed in invasive breast
cancer cells. The BRCA1 gene product is a negative regulator of mammary cell
proliferation which is expressed at diminished levels in sporadic breast cancer.

BEST MODE FOR CARRYING OUT THE INVENTION

For the purposes of the subsequent description, the following definitions will be
used:

Nucleic acid sequences which are "complementary" are those which are capable 
of base-pairing according to the standard Watson-Crick complementarity rules. That
is, the larger purines will always base pair with the smaller pyrimidines to form
only combinations of Guanine paired with Cytosine (G:C) and Adenine paired with
either Thymine (A:T) in the case of DNA or Uracil (A:U) in the
case of RNA.

"Hybridization techniques" refers to molecular biological techniques which
involve the binding or hybridization of a probe to complementary sequences in a
polynucleotide. Included among these techniques are Northern blot analysis, Southern
blot analysis, nuclease protection assay, etc.

"Hybridization" and "binding" in the context of probes and denatured DNA are
used interchangeably. Probes which are hybridized or bound to denatured DNA are
aggregated to complementary sequences in the polynucleotide. Whether or not a
particular probe remains aggregated with the polynucleotide depends on the degree of
complementarity, the length of the probe, and the stringency of the binding conditions.
The higher the stringency, the higher must be the degree of complementarity and/or the
longer the probe.

"Probe" refers to an oligonucleotide or short fragment of DNA designed to be 
sufficiently complementary to a sequence in a denatured nucleic acid to be probed and
to be bound under selected stringency conditions.

"Label" refers to a modification to the probe nucleic acid that enables the
experimenter to identify the labeled nucleic acid in the presence of unlabeled nucleic
acid. Most commonly, this is the replacement of one or more atoms with radioactive
isotopes. However, other labels include covalently attached chromophores, fluorescent
moieties, enzymes, antigens, groups with specific reactivity, chemiluminescent
moieties, and electrochemically detectable moieties, etc.

"Marker gene" refers to any gene selected for detection which displays
differential expression in abnormal tissue as opposed to normal tissue. It is also
referred to as a differentially expressed gene.

"Marker protein" refers to any protein encoded by a "marker gene" which
protein displays differential expression in abnormal tissue as opposed to normal tissue.

"Tissuemizer" describes a tissue homogenization probe.

"Abnormal tissue" refers to pathologic tissue which displays cytologic,
histologic and other defining and derivative features which differ from that of normal
tissue. This includes in the case of abnormal breast tissue, among others, pre-invasive
and invasive neoplasms.

"Normal tissue" refers to tissue which does not display any pathologic traits.

"PCR technique" describes a method of gene amplification which involves
sequence-based hybridization of primers to specific genes within a DNA sample (or
library) and subsequent amplification involving multiple rounds of annealing, elongation
and denaturation using a heat-stable DNA polymerase.

"RT-PCR" is an abbreviation for reverse transcriptase-polymerase chain
reaction. Subjecting mRNA to the reverse transcriptase enzyme results in the
production of cDNA which is complementary to the base sequences of the mRNA.
Large amounts of selected cDNA can then be produced by means of the polymerase
chain reaction which relies on the action of heat-stable DNA polymerase produced by
Thermus aquaticus for its amplification action.

"Microscopically-directed" refers to the method of tissue sampling by which the
tissue sampled is viewed under a microscope during the sampling of that tissue such
that the sampling is precisely limited to a given tissue type, as the investigator requires.
Specifically, it is a collection step which involves the use of a punch biopsy instrument.
This surgical instrument is stereotactically manually-directed to harvest exclusively from
abnormal tissue which exhibits histologic or cytologic characteristics of pre-invasive
cancer. The harvest is correlated with a companion slide, stained to recognize the
target tissue.

"Differential display" describes a method in which expressed genes are 
compared between samples using low stringency PCR with random oligonucleotide
primers.

"Differential screening" describes a method in which genes within cDNA
libraries are compared between two samples by differential hybridization of cDNAs to
probes prepared from each library.

"Nuclease protection assay" refers to a method of RNA quantitation which
employs strand specific nucleases to identify specific RNAs by detection of duplexes.

"Differential expression" describes the phenomenon of differential genetic
expression seen in abnormal tissue in comparison to that seen in normal tissue.

"Isolatable tissue structure" refers to a tissue structure which when visualized
microscopically or otherwise is able to be isolated from other different surrounding
tissue types.

"In situ hybridization of RNA" refers to the use of labeled DNA probes
employed in conjunction with histological sections on which RNA is present and with
which the labeled probe can hybridize allowing an investigator to visualize the location
of the specific RNA within the cell.

"Comedo DCIS cells" refers to cells comprising an in situ lesion with the
combined features of highest grade DCIS.

"Non-comedo DCIS cells" refers to cells of DCIS lesions without comedo
features.

"Cloning" describes separation and isolation of single genes.

"Sequencing" describes the determination of the specific order of nucleic acids
in a gene or polynucleotide.

The present invention provides a method for detecting and diagnosing cancer by 
analyzing marker genes which are differentially expressed in early, pre-invasive breast 
cancer, specifically in non-comedo DCIS cells. Our histopathologic studies have 
demonstrated that certain morphologic patterns in breast tissue are pre-malignant, 
leading to invasive breast cancer in at least 20-30% of patients. We have developed 
a new method for analyzing gene expression in normal, pre-malignant and malignant 
breast biopsies which allows simultaneous comparison and cloning of marker genes
which are differentially expressed in pre-invasive breast cancer. These marker genes
(which appear as differentially expressed genes in pre-invasive breast cancer) can be
used as probes to develop diagnostic tests for the early detection of pre-invasive breast
cancer (Sambrook, 1989).

The present invention thus comprises a method of identification of marker genes
which are expressed in the majority of pre-invasive breast cancer tissue samples. It
involves cDNA library preparation followed by a modified differential display method.
Use of genetic engineering methods (Sambrook, 1989) can bias the screening to
specifically identify genes whose encoded proteins are secreted or are present at the cell
surface, in order to find proteins which will be useful markers for diagnostic blood tests
(secreted proteins) or for diagnostic imaging studies (cell surface proteins).

Naturally, the present invention also encompasses DNA segments which are
complementary, or essentially complementary, to the sequence set forth in SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6,
SEQ ID NO:7, SEQ ID NO:47 and SEQ ID NO:48. Nucleic acid sequences which are
"complementary" are those which are capable of base-pairing according to the standard
Watson-Crick complementarity rules. As used herein, the term "complementary
sequences" means nucleic acid sequences which are substantially complementary, as
may be assessed by the same nucleotide comparison set forth above, or as defined as
being capable of hybridizing to the nucleic acid segment of SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7,
SEQ ID NO:47 and SEQ ID NO:48 under relatively stringent conditions such as those
described herein.
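
As an illustration of the Watson-Crick pairing rules referred to above, the following minimal sketch computes the strand that would pair perfectly with a given sequence. The function names and the example sequences are hypothetical; the sketch handles exact complementarity only and does not model the "substantially complementary" case, which depends on hybridization conditions.

```python
# Watson-Crick partners: G pairs with C; A pairs with T (DNA) or U (RNA).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq, rna=False):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_complementary(probe, target):
    """True when two DNA strands pair at every position (no mismatches)."""
    return reverse_complement(probe) == target.upper()


example = reverse_complement("ATGC")  # hypothetical 4-base sequence
```

A probe designed this way pairs with its target at every base; tolerance of mismatches is governed by the stringency of the hybridization conditions rather than by the sequence comparison itself.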

The nucleic acid segments of the present invention, regardless of the length of 
the coding sequence itself, may be combined with other DNA sequences, such as 
promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning 
sites, other coding segments, and the like, such that their overall length may vary 
considerably. It is therefore contemplated that a nucleic acid fragment of almost any
length may be employed, with the total length preferably being limited by the ease of
preparation and use in the intended recombinant DNA protocol. For example, nucleic
acid fragments may be prepared which include a short stretch complementary to SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:47 and SEQ ID NO:48, such as about 10
nucleotides, and which are up to 10,000 or 5,000 base pairs in length, with segments
of 500 being preferred in most cases. DNA segments with total lengths of about 1,000,
500, 200, 100 and about 50 base pairs in length are also contemplated to be useful.

It will also be understood that this invention is not limited to the particular
nucleic acid and amino acid sequences of SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:47,
SEQ ID NO:48, and SEQ ID NO:49. Recombinant vectors and isolated DNA
segments may therefore variously include the differentially expressed coding regions
or the BRCA1 coding regions themselves, coding regions bearing selected alterations
or modifications in the basic coding region, or they may encode larger polypeptides
which nevertheless include differentially expressed coding regions and the BRCA1
coding regions or may encode biologically functional equivalent proteins or peptides
which have variant amino acid sequences.

The DNA segments of the present invention encompass biologically functional
equivalent differentially expressed proteins and peptides and biologically functional
equivalent proteins of BRCA1. Such sequences may arise as a consequence of codon
redundancy and functional equivalency which are known to occur naturally within
nucleic acid sequences and the proteins thus encoded. Alternatively, functionally
equivalent proteins or peptides may be created via the application of recombinant DNA
technology, in which changes in the protein structure may be engineered, based on
considerations of the properties of the amino acids being exchanged. Changes designed
by man may be introduced through the application of site-directed mutagenesis
techniques, e.g., to introduce improvements to the antigenicity of the protein or to test
site-directed mutants or others in order to examine carcinogenic activity of the
differentially expressed marker genes at the molecular level.

If desired, one may also prepare fusion proteins and peptides, e.g., where the
differentially expressed marker gene coding regions are aligned within the same
expression unit with other proteins or peptides having desired functions, such as for
purification or immunodetection purposes (e.g., proteins which may be purified by
affinity chromatography and enzyme label coding regions, respectively).

Recombinant vectors form important further aspects of the present invention.
Particularly useful vectors are contemplated to be those vectors in which the coding
portion of the DNA segment is positioned under the control of a promoter. The
promoter may be in the form of the promoter which is naturally associated with a
RIBRED gene, e.g., in human cells, as may be obtained by isolating the 5' non-coding
sequences located upstream of the coding segment or exon, for example, using
recombinant cloning and/or PCR technology, in connection with the compositions
disclosed herein.

In other embodiments, it is contemplated that certain advantages will be gained
by positioning the coding DNA segment under the control of a recombinant, or
heterologous, promoter. As used herein, a recombinant or heterologous promoter is
intended to refer to a promoter that is not normally associated with a differentially
expressed marker gene or the BRCA1 gene in its natural environment. Such promoters
may include MMTV promoters normally associated with other genes, and/or promoters
isolated from any other bacterial, viral, eukaryotic, or mammalian cell. Naturally, it
will be important to employ a promoter that effectively directs the expression of the
DNA segment in the cell type chosen for expression. The use of promoter and cell
type combinations for protein expression is generally known to those of skill in the art
of molecular biology, for example, see Sambrook et al. (1989). The promoters
employed may be constitutive, or inducible, and can be used under the appropriate
conditions to direct high level expression of the introduced DNA segment, such as is
advantageous in the large-scale production of recombinant proteins or peptides.
Appropriate promoter systems contemplated for use in high-level expression include,
but are not limited to, appropriate bacterial promoters.

As mentioned above, in connection with expression embodiments to prepare 
recombinant differentially expressed marker gene encoded proteins and peptides, it is 
contemplated that longer DNA segments will most often be used, with DNA segments 
encoding the entire differentially expressed protein or subunit being most preferred.
However, it will be appreciated that the use of shorter DNA segments to direct the 
expression of differentially expressed peptides or epitopic core regions, such as may 
be used to generate anti-marker protein antibodies, also falls within the scope of the 
invention (Harlow et al, 1988). 

DNA segments which encode peptide antigens from about 15 to about 50
amino acids in length, or more preferably, from about 15 to about 30 amino acids in
length are contemplated to be particularly useful. The C terminus of proteins provides
an excellent region for peptide antigen recognition (Harlow et al, 1988). DNA segments
encoding peptides will generally have a minimum coding length in the order of about
45 to about 147, or to about 90 nucleotides. DNA segments encoding partial length
peptides may have a minimum coding length in the order of about 50 nucleotides for
a polypeptide in accordance with SEQ ID NO:3, or about 264 nucleotides for a
polypeptide in accordance with SEQ ID NO:1.
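
The relationship between peptide length and minimum coding length follows from the three-nucleotide codon. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and stop codons and untranslated regions are deliberately not counted.

```python
NUCLEOTIDES_PER_CODON = 3  # each amino acid is specified by one triplet codon


def coding_length(num_amino_acids):
    """Minimum number of nucleotides needed to encode a peptide of this length."""
    return num_amino_acids * NUCLEOTIDES_PER_CODON


def max_peptide_length(num_nucleotides):
    """Largest number of complete codons that fit in a segment of this length."""
    return num_nucleotides // NUCLEOTIDES_PER_CODON


short_antigen = coding_length(15)   # 15 residues
preferred_max = coding_length(30)   # 30 residues
```

By this arithmetic the figures of about 45 and about 90 nucleotides correspond to peptides of 15 and 30 residues, and a segment of 147 nucleotides holds 49 complete codons.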

In addition to their use in directing the expression of the differentially expressed
marker proteins, the nucleic acid sequences disclosed herein also have a variety of other
uses. For example, they also have utility as probes or primers in nucleic acid
hybridization embodiments. As such, it is contemplated that oligonucleotide fragments
corresponding to the sequences of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ
ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7 for stretches of between
about 10 to 15 nucleotides and about 20 to 30 nucleotides will find particular utility.
Longer complementary sequences, e.g., those of about 40, 50, 100, 200, 500, 1000,
and even up to full length sequences of about 2,000 nucleotides in length, will also be
of use in certain embodiments.

The ability of such nucleic acid probes to specifically hybridize to differentially
expressed marker gene sequences will enable them to be of use in detecting the
presence of complementary sequences in a given sample. However, other uses are
envisioned, including the use of the sequence information for the preparation of mutant
species primers, or primers for use in preparing other genetic constructions.

Nucleic acid molecules having stretches of 20, 30, 50, or even of 500 
nucleotides or so, complementary to SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7 are particularly
contemplated as hybridization probes for use in, e.g., Southern and Northern blotting.
This would allow differentially expressed structural or regulatory genes to be analyzed, 
both in patients and sample tissue from pre-invasive and invasive breast tissue. The 
total size of fragment, as well as the size of the complementary stretch(es), will
ultimately depend on the intended use or application of the particular nucleic acid
segment. Smaller fragments will generally find use in hybridization embodiments,
wherein the length of the complementary region may be varied, such as between about
10 and about 100 nucleotides, but larger complementary stretches of up to about 300
nucleotides may be used, according to the length of the complementary sequences one
wishes to detect.

Nucleic Acid Hybridization

The use of a hybridization probe of about 10 nucleotides in length allows the
formation of a duplex molecule that is both stable and selective. Molecules having
complementary sequences over stretches greater than 10 bases in length are generally
preferred, though, in order to increase stability and selectivity of the hybrid, and
thereby improve the quality and degree of specific hybrid molecules obtained. One will
generally prefer to design nucleic acid molecules having gene-complementary stretches
of 15 to 20 nucleotides, or even longer where desired.
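
The effect of probe length on duplex stability can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides that is not part of this disclosure: roughly 2°C of melting temperature per A or T and 4°C per G or C. The sketch below is an approximation under that assumption only; the probe sequences are hypothetical and the estimate ignores salt and formamide concentration.

```python
def wallace_tm(probe):
    """Rough melting temperature (degrees C) of a short DNA probe.

    Uses the Wallace rule, Tm = 2*(A+T) + 4*(G+C), which is only a
    rule-of-thumb estimate for short oligonucleotides.
    """
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


# Hypothetical probes with the same base composition but different lengths.
tm_10mer = wallace_tm("ATGCATGCAT")
tm_20mer = wallace_tm("ATGCATGCATGCATGCATGC")
```

Under this estimate the 20-base probe forms a markedly more stable duplex than the 10-base probe, which is consistent with the preference stated above for stretches longer than 10 bases.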

Hybridization probes may be selected from any portion of any of the sequences
disclosed herein. All that is required is to review the sequences set forth in SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6,
and SEQ ID NO:7 and to select any continuous portion of one of the sequences, from
about 10 nucleotides in length up to and including the full length sequence, that one
wishes to utilise as a probe or primer. The choice of probe and primer sequences may
be governed by various factors. By way of example only, one may wish to
employ primers from towards the termini of the total sequence, or from the ends of the
functional domain-encoding sequences, in order to amplify further DNA; one may
employ probes corresponding to the entire DNA, or to the 5' region, to clone marker-
type genes from other species or to clone further marker-like or homologous genes
from any species including human; and one may employ randomly selected, wild-type
and mutant probes or primers with sequences centered around the RibRed M2 subunit
encoding sequence to screen DNA samples for differentially expressed levels of
RibRed, such as to identify human subjects which may be expressing differential levels
of RibRed and thus may be susceptible to breast cancer.
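
Selecting "any continuous portion" of a disclosed sequence amounts to sliding a window of the chosen length along it. The sketch below enumerates every such candidate; the function name and the 21-base example sequence are hypothetical and do not correspond to any SEQ ID NO of this disclosure.

```python
def candidate_probes(sequence, length):
    """List every contiguous stretch of the given length as (start_index, probe).

    Returns an empty list when the requested length is not positive or
    exceeds the length of the sequence.
    """
    if length <= 0 or length > len(sequence):
        return []
    last_start = len(sequence) - length
    return [(i, sequence[i:i + length]) for i in range(last_start + 1)]


# Hypothetical sequence, for illustration only.
template = "ATGGCCATTGTAATGGGCCGC"
probes = candidate_probes(template, 10)
```

A sequence of N bases yields N - L + 1 candidates of length L. Which of them is actually used would then be decided by the factors listed above, such as position relative to the termini or to a functional domain.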

The process of selecting and preparing a nucleic acid segment which includes
a sequence from within SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7 may alternatively be described as
"preparing a nucleic acid fragment". Of course, fragments may also be obtained by
other techniques such as, e.g., by mechanical shearing or by restriction enzyme
digestion. Small nucleic acid segments or fragments may be readily prepared by, for
example, directly synthesizing the fragment by chemical means, as is commonly
practiced using an automated oligonucleotide synthesizer. Also, fragments may be
obtained by application of nucleic acid reproduction technology, such as the PCR
technology of U.S. Patent 4,603,102 (incorporated herein by reference), by introducing
selected sequences into recombinant vectors for recombinant production, and by other
recombinant DNA techniques generally known to those of skill in the art of molecular
biology.

Accordingly, the nucleotide sequences of the invention may be used for their
ability to selectively form duplex molecules with complementary stretches of
differentially expressed marker genes or cDNAs. Depending on the application
envisioned, one will desire to employ varying conditions of hybridization to achieve
varying degrees of selectivity of probe towards target sequence. For applications
requiring high selectivity, one will typically desire to employ relatively stringent
conditions to form the hybrids, e.g., one will select relatively low salt and/or high
temperature conditions, such as provided by 0.02M-0.15M NaCl at temperatures of
50°C to 70°C. Such selective conditions tolerate little, if any, mismatch between the
probe and the template or target strand, and would be particularly suitable for isolating
specific differentially expressed marker genes.

Of course, for some applications, for example, where one desires to prepare
mutants employing a mutant primer strand hybridized to an underlying template or
where one seeks to isolate marker gene sequences from related species, functional
equivalents, or the like, less stringent hybridization conditions will typically be needed
in order to allow formation of the heteroduplex. In these circumstances, one may
desire to employ conditions such as 0.15M-0.9M salt, at temperatures ranging from
20°C to 55°C. Cross-hybridizing species can thereby be readily identified as positively
hybridizing signals with respect to control hybridizations. In any case, it is generally
appreciated that conditions can be rendered more stringent by the addition of increasing
amounts of formamide, which serves to destabilize the hybrid duplex in the same
manner as increased temperature. Thus, hybridization conditions can be readily
manipulated, and thus will generally be a method of choice depending on the desired
results.

In certain embodiments, it will be advantageous to employ nucleic acid
sequences of the present invention in combination with an appropriate means, such as
a label, for determining hybridization. A wide variety of appropriate indicator means
are known in the art, including fluorescent, radioactive, enzymatic or other ligands,
such as avidin/biotin, which are capable of giving a detectable signal. In preferred
embodiments, one will likely desire to employ a fluorescent label or an enzyme tag,
such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other
environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator
substrates are known which can be employed to provide a means visible to the human
eye or spectrophotometrically, to identify specific hybridization with complementary
nucleic acid-containing samples.

In general, it is envisioned that the hybridization probes described herein will 
be useful both as reagents in solution hybridization as well as in embodiments 
employing a solid phase. In embodiments involving a solid phase, the test DNA (or 
RNA) is adsorbed or otherwise affixed to a selected matrix or surface. This fixed, 
single-stranded nucleic acid is then subjected to specific hybridization with selected 
probes under desired conditions. The selected conditions will depend on the particular 
circumstances based on the particular criteria required (depending, for example, on the 
G+C contents, type of target nucleic acid, source of nucleic acid, size of hybridization
probe, etc.). Following washing of the hybridized surface so as to remove
nonspecifically bound probe molecules, specific hybridization is detected, or even
quantified, by means of the label (Sambrook et al, 1989).
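
The two example salt and temperature ranges quoted earlier in this section for relatively stringent and less stringent hybridization can be captured in a small lookup, sketched below. The dictionary and function names are hypothetical. The ranges are the examples given in the text rather than hard limits, and the effect of formamide is not modeled.

```python
# Example ranges quoted in the text (salt in mol/L, temperature in degrees C).
HIGH_STRINGENCY = {"salt_molar": (0.02, 0.15), "temp_c": (50, 70)}
LOW_STRINGENCY = {"salt_molar": (0.15, 0.9), "temp_c": (20, 55)}


def _within(value, bounds):
    """True when value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def matching_regimes(salt_molar, temp_c):
    """Return the names of the quoted example regimes that a condition falls in.

    The two ranges touch at 0.15 M salt and overlap between 50 and 55 degrees C,
    so a borderline condition can match both; an empty list means neither.
    """
    regimes = []
    for name, regime in (("high", HIGH_STRINGENCY), ("low", LOW_STRINGENCY)):
        if _within(salt_molar, regime["salt_molar"]) and _within(temp_c, regime["temp_c"]):
            regimes.append(name)
    return regimes


selective = matching_regimes(0.05, 65)   # low salt, high temperature
permissive = matching_regimes(0.5, 37)   # higher salt, lower temperature
```

Returning a list rather than a single label reflects that the quoted ranges are illustrative and share a boundary, so that actual conditions would still be chosen case by case as described above.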

In a preferred embodiment of the method, certain preliminary procedures are 
necessary to prepare the sample tissue and the probes before the detection of differential 
expression of marker genes in abnormal tissue as compared to that in normal tissue can 
be accomplished. 

SAMPLE PREPARATION 

RNA purification 

RNA was isolated from frozen tissue samples by mincing of microdissected
frozen tissue fragments with a razor blade and then adding 800 microliters of 5.6M
guanidinium to increase mixing, followed by a 30 second microcentrifuge centrifugation
at 14,000 rpm to remove particulate matter. The supernatant was then removed and
the viscosity was reduced by multiple aspirations through a 22 gauge needle and then
200 ul of chloroform was added and the sample was incubated on ice for 15 minutes
(during this time the sample was vortexed multiple times). Following incubation with
chloroform, the sample was centrifuged for 15 minutes at 14,000 rpm and the aqueous
layer was removed and ethanol precipitated. This extraction method produces RNA
which is primarily derived from cells of epithelial origin. In order to obtain RNA
samples which presumably include RNA derived from stromal cells, the
particulate material (remaining in the pellet from the 30 second centrifugation) was
homogenized with a tissuemizer, washed with PBS, treated with collagenase at 37°C
for 30 minutes, sonicated, extracted with phenol/chloroform and ethanol precipitated.

cDNA libraries were constructed in lambda phage using polyA-selected mRNA
from the following samples: cultured human breast epithelial cells, tissue from three
reduction mammoplasty patients, tissue from three DCIS patients, and tissue from one
DCIS patient (patient #10) that showed a focus of microinvasion adjacent to an area of
DCIS. Multiple punches were needed to obtain sufficient RNA for poly A selection and
library construction. 200 ug of total RNA was obtained by pooling 20 punches from
normal breast tissue (reduction mammoplasty samples) and 5-8 punches from DCIS
lesions, presumably reflecting the greater cellularity of the DCIS samples. cDNA
libraries were constructed by first and second strand cDNA synthesis followed by the
addition of directional synthetic linkers (ZAP-cDNA Synthesis Kit, Stratagene, La
Jolla, California). The Xho I-Eco RI linkered cDNA was then ligated into lambda
arms, packaged with packaging extracts, and then used to infect XL1-Blue bacteria,
resulting in cDNA libraries.

PROBE PREPARATION

The collagen III probe employed for nuclease protection assays was constructed
by subcloning the 208 bp Hinc II-Pst I fragment from the 3' untranslated region of the
human type III procollagen gene into pGem4Z. This region of the human procollagen
III gene was obtained by PCR amplification of published sequence (Ala-Kokko et al,
1989) followed by restriction with Hinc II and Pst I. For a control probe to assure
equal loading and recovery of RNA, we used a T7 polymerase-generated probe for
human glyceraldehyde phosphate dehydrogenase (GAPD) which protects a 140 bp Sac
I-Xba I fragment (a generous gift from Janice Nigro, Vanderbilt University). Probe
DCIS-1 was generated by linearizing the rescued plasmid with Pvu II, which should
generate a 200 bp protected fragment. RNase protection assays were performed with
1 ug of unselected RNA and the above-cited probes using the methods we have reported
previously (Holt, 1993).

Differential Display-based cloning of cDNAs:

Rescued cDNA library samples were used as templates for low stringency PCR
with either a pair of 25 bp primers or an anchored 14 bp primer paired with a
random 25 bp primer. Random 25 bp primers were generated by a computer-based
algorithm (Jotte and Holt, unpublished). Samples were denatured for two minutes at
95°C followed by 40 cycles, each cycle consisting of denaturation for 1 minute at
94°C, annealing for 2 minutes at 25°C, and extension for 1 minute at 72°C. The
samples were then run on a 6% non-denaturing polyacrylamide gel, which was dried
and autoradiographed. Specific bands were excised then reamplified with the same
primers used for their generation. Specificity was confirmed on a 6% polyacrylamide
gel, and samples were purified by ethanol precipitation of the remainder of the PCR
reaction. Fragments were then individually cloned into Srf I cut vectors by standard
methods using PCR-Script™ SK(+) Cloning Kit (Stratagene, La Jolla, California) and
then sequenced.
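
As a rough planning aid, the cycling program described above can be written down and its total programmed hold time computed. The following is a minimal sketch only: the step durations and cycle count are taken from the text, while the function and variable names are illustrative assumptions, and ramp times between temperatures (which depend on the instrument) are ignored.

```python
# Hold times in minutes, as stated in the protocol text.
INITIAL_DENATURATION_MIN = 2          # 2 minutes at 95 C
CYCLES = 40
STEPS_MIN = {
    "denaturation_94C": 1,
    "annealing_25C": 2,               # low-stringency annealing
    "extension_72C": 1,
}


def total_hold_minutes(initial, cycles, steps):
    """Return the total programmed hold time in minutes (ramp times excluded)."""
    return initial + cycles * sum(steps.values())


total = total_hold_minutes(INITIAL_DENATURATION_MIN, CYCLES, STEPS_MIN)
print(f"Programmed hold time: {total} min")
```

Under these assumptions the program holds for 2 + 40 x 4 = 162 minutes; the actual run time on a thermal cycler would be longer because of ramping.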

EXAMPLE 1

Studies showing Increased Risk of Breast Cancer
in Patients with DCIS
Since the 1970's, studies of pre-invasive lesions associated with the development
of breast cancer have been undertaken in an attempt to refine histologic and cytologic
criteria for the hyperplastic lesions analogous to those of the uterine cervix and colon.
Because of the availability of tissue from breast biopsies done many years previously,
cohorts of women who underwent breast biopsies 15 to 20 years ago can be studied
to determine the risk for development of breast cancer attributable to specific lesions.
Many concurrent studies evaluating lesions associated with cancer at time of cancer
diagnosis led the way in pointing out lesions of potential interest (Wellings et al, 1975).
Hopefully, these intermediate stages in cancer development will serve to provide
indicators of breast cancer development sufficiently precise to guide prevention and
intervention strategies (Weed et al, 1990; Lippman et al, 1990). Such intermediate
elements prior to the development of metastatic capable cancers also provide the
opportunity to define the molecular biology of these elements. Studies of the
development of pre-invasive breast disease have provided insight into different types
of lesions with different implications for breast cancer risk and the process of
carcinogenesis (See Figure 1). Pre-invasive breast disease is herewith defined to be any 
reproducibly defined condition which confers an elevated risk of breast cancer 
approaching double that of the general population (Komitowski et al, 1990). The 
specifically-defined atypical hyperplasias and lobular carcinoma in situ confer relative 
risks of four to ten times that of the general population. This risk is for carcinoma to 
develop anywhere in either breast (Page et al, 1985; Page et al, 1991). The statistical
significance of these observations has regularly been P < .0001. Thus, absolute risk
figures of 10-20% likelihood of developing into invasive carcinoma in 10 to 15 years
arise. DCIS is a very special element in this story because the magnitude of risk is as
high as any other condition noted (P < .00005), but remarkably, the developing
invasive cancer is in the same site in the same breast. This local recurrence and
evolution to invasiveness marks these lesions as determinate precursors of invasive
breast cancer (Betsill et al, 1978; Page et al, 1982). These figures are for the type of
DCIS which has become detected very commonly since the advent of mammography,
the small and NCDCIS variety. It is likely that the comedo DCIS variety indicates a
much greater risk, often presenting as larger lesions, and treated regularly by 
mastectomy in the past 50 years making follow-up studies impossible (Figure 1). 

The precision of histopathologic diagnosis in this area as noted in Table I (shown 
in Figure 1) was most convincingly confirmed in a large, prospective study (London 
et al, 1991). There has also been a recent review of the reproducibility of the 
assignment of diagnosis by a panel of pathologists (Schnitt et al, 1992). The precision 
has been fostered by combining histologic pattern criteria with cytologic and extent of
lesion criteria. Classic surgical pathology criteria were predominantly derived from
histologic pattern only. A further point of relevance to the importance of these
histopathologically defined lesions of pre-malignancy in the breast is the relationship
to familiality. A family history of breast cancer in a first degree relative confers about
a doubling of breast cancer risk. However, women with the atypical hyperplasias at
biopsy and a family history of breast cancer are at 9-10 times the risk of developing
invasive breast cancer as the general population (Dupont et al, 1985; Dupont et al,
1989).

Careful consideration of all of the above-mentioned epidemiologic data has led 
to the following model for progression from generalized pre-malignant lesions to 
determinant lesions to invasive cancer. Figure 2 shows this model for the induction and 
progression of pre-invasive breast disease based on study of the Vanderbilt cohort 
(Dupont et al, 1985) of more than 10,000 breast biopsies (follow-up rate 85%; median
time of 17 years; 135 women developed breast cancer).

EXAMPLE 2 

Identification of genes which are differentially expressed in DCIS 
Construction of cDNA libraries from DCIS lesions 
In order to study differential gene expression in DCIS, we collected cases of 
NCDCIS. The diagnosis of DCIS is made on histomorphologic grounds based on 
architectural, cytologic, and occasionally extent criteria. NCDCIS lacks comedo 
features and consists of microscopic intraductal lesions which fill and extend the duct,
contain rigid internal architecture, and often have hyperchromatic and monomorphic 
nuclei. 

Study of non-comedo DCIS for differential marker gene expression indicates the 
diagnostic utility of comparison of marker gene expression in these tissues. Although 
the morbidity and mortality of breast cancer clearly result from invasion and
metastasis, the development of breast cancer is clearly significant in its early stages for 
two basic reasons: 

1) The molecular changes will presumably be simpler in early lesions than
in later lesions which may have acquired numerous mutations or "hits";
and

2) Successful prevention strategies may require attacking cancer before it
develops the capacity to invade or metastasize.

Non-comedo DCIS is the earliest determinant lesion which recurs locally as 
invasive cancer. Although comedo DCIS may be technically easier to study because 
the tumors are larger, its aggressiveness and the presence of numerous genetic 
alterations (such as p53 and erbB2) suggest that it may have advanced beyond the
earliest stages of carcinogenesis. 

The commercial utility of a method for prevention of cancer is clear. In order
to study differential gene expression in DCIS, breast tissue with extensive microscopic
non-comedo DCIS was identified and banked in a frozen state. cDNA libraries were
constructed from mRNA isolated from frozen sections of DCIS lesions. Tissue samples
from patients with mammographic results consistent with DCIS were cryostat frozen
and a definitive diagnosis was made by the histopathologic criteria which we have
described (Jensen et al, submitted for publication; Holt et al, in press).

Control mRNA was obtained from frozen tissue samples obtained from reduction 
mammoplasties and from cultured human breast epithelial cells. Because non-comedo
DCIS is a microscopic lesion, we had to microlocalize regions of DCIS in biopsy 
samples. To accomplish this we prepared frozen sections in which we located regions 
of DCIS and then employed a 2 mm punch to obtain an abnormal tissue sample only 
from those regions that contained DCIS. This selective harvesting was accomplished 
by carefully aligning the frozen section slide with the frozen tissue block and 
identifying areas of interest. The harvest of the appropriate area was then confirmed 
with a repeat frozen section. A similar approach was used to isolate mRNA from 
lobules of normal breast in samples collected from a reduction mammoplasty. Prior 
studies have shown that breast lobules are approximately 2.5 mm in diameter, thus the
2 mm punch provided a well-tailored excision. This microlocation and collection step, 
in which abnormal tissue samples are collected from an isolatable tissue structure, was 
performed with extreme care and was absolutely crucial to the success of these studies. 
Contamination by normal breast epithelial cells or by breast stromal cells would clearly 
negatively skew the differential screening approach. If the punch biopsy did not cleanly
excise DCIS without contamination by other cell types or tissues then the sample was
not used for mRNA isolation (Jensen et al, submitted for publication). Figure 3
contains color photos of DCIS (abnormal) tissue, before (upper left panel) and after
excisional punch biopsy (upper right panel). The lower panels show tissue samples of
normal breast tissue (lower left panel), and invasive breast cancer (lower right panel). 

Following microlocation punch harvesting of the frozen tissue, RNA was isolated, 
purified, and employed to construct cDNA libraries. RNA was isolated following 
mincing of tissue in 5.6M guanidinium isothiocyanate and 40% phenol, centrifugation
to remove particulate matter, viscosity reduction by repeated aspiration through a 22
gauge needle, chloroform extraction and ethanol precipitation. In most samples there
was particulate matter resistant to guanidinium-phenol extraction that was white in color
and fibrous in appearance and was presumed to represent breast stroma. This stromal
material was sparse in DCIS samples but abundant in samples obtained from normal
breast tissue derived from reduction mammoplasties. The stromal material was minced
with a tissuemizer, washed with PBS, treated with collagenase at 37°C for 30 minutes,
sonicated, extracted with phenol/chloroform and ethanol precipitated. 200 ug of total
RNA was obtained by pooling 20 punches from normal breast tissue (reduction 
mammoplasty samples) and 5-8 punches from DCIS lesions, presumably reflecting the 
greater cellularity of the DCIS samples. All libraries had greater than 50% inserts and 
contained between 2 X 10^ and 7 X 10'' phage recombinants with an average insert size 
varying between 500 and 1000 base pairs. 
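
The inference about cellularity can be made concrete by working out the RNA yield per punch. This is a minimal sketch under a stated assumption: it reads the text as meaning that roughly 200 ug of total RNA was obtained from each pool (20 punches of normal tissue, or 5-8 punches of DCIS tissue). The function name is illustrative.

```python
TOTAL_RNA_UG = 200.0  # total RNA per pool; assumed to apply to each pool separately


def yield_per_punch(total_ug, punches):
    """Average RNA yield per 2 mm punch, in micrograms."""
    return total_ug / punches


normal_yield = yield_per_punch(TOTAL_RNA_UG, 20)    # normal breast tissue
dcis_yield_low = yield_per_punch(TOTAL_RNA_UG, 8)   # DCIS, 8 punches pooled
dcis_yield_high = yield_per_punch(TOTAL_RNA_UG, 5)  # DCIS, 5 punches pooled

print(normal_yield, dcis_yield_low, dcis_yield_high)
print(dcis_yield_low / normal_yield, dcis_yield_high / normal_yield)
```

On this reading, a DCIS punch yields about 25-40 ug of RNA against about 10 ug for a normal punch, a 2.5 to 4 fold difference, which is in line with the greater cellularity noted above.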

EXAMPLE 3

Development of an extraction method which produces breast epithelial RNA
It was necessary that tissue samples not be contaminated by non-epithelial stromal
cells. Such contamination would complicate efforts to compare gene expression
between samples. In order to test the extent of stromal contamination of the mRNA
samples, we determined the level of expression of collagen III mRNA by an RNase
protection assay. RNase protection assays were employed in these and subsequent
studies because the method is quantitative and can be performed on small amounts of
unselected RNA. Collagen III mRNA was identified in the presumed stromal fraction
of the normal breast tissue and to a lesser extent in the microinvasive breast cancer
sample, but no expression of collagen III was detected in the DCIS samples which were
subsequently employed for cDNA library construction. Figure 4 compares expression
in NL2 and #10CA with other patient samples and NL1 to determine collagen III
expression.

Expression of Collagen III mRNA in tissue mRNA samples was analyzed by
RNase protection assay by methods we have reported previously (Holt, 1993). One ug
of mRNA was hybridized with two labeled RNA probes: a T7 polymerase-generated
probe for human glyceraldehyde phosphate dehydrogenase (GAPD) which protects a
140 bp Sac I-Xba I fragment; and a T7 polymerase-generated probe which protects a
208 bp Hinc II-Pst I fragment from the 3' untranslated region of the human type III
procollagen gene (Coll III) obtained by PCR subcloning of the published sequence
(Ala-Kokko et al, 1991). RNA samples were labeled as follows: NL1 is RNA from cultured
human breast epithelial cells (Hammond et al, 1984), NL2 is RNA from normal breast
tissue, NL3 is RNA derived from the fibrous stromal fraction of breast tissue as
described (Jensen et al, submitted for publication), NL4 is another sample from normal
breast tissue. This is described in greater detail on page 30 of this patent application.
#12, #8, #4, #6, and #10 are from patient samples with DCIS. Sample #10CA is RNA
obtained from the small focus of microinvasion shown in Figure 3. Con is a control
sample using tRNA.

EXAMPLE 4 

Screening of cDNA libraries 
Following successful testing which demonstrated that stromal contamination was
not a problem, cDNA libraries were constructed in lambda phage using polyA-selected
mRNA from the following samples: cultured human breast epithelial cells, tissue from
three reduction mammoplasty patients, tissue from three DCIS patients, and tissue from
one DCIS patient (patient #10) that showed a small focus of invasion adjacent to an
area of DCIS. Multiple punches were needed to obtain sufficient RNA for poly A
selection and library construction. Selective handling of tissue was accomplished.
Comparison of gene expression between samples was performed by either
differential screening or a modification of differential display (Liang et al, 1992a; Liang
et al, 1992b; Saiki et al, 1988; Melton et al, 1984). Plasmid DNA was prepared from
the cDNA libraries following helper phage rescue and screened by two independent
methods. Figure 5 shows the results of differential display comparing cDNAs
of several patient DCIS samples with cDNA obtained from normal breast epithelial 
cells and an early invasive cancer. Although few genes shown in this Figure are 
differentially expressed in the majority of samples with DCIS, the heterogeneity of gene 
expression in patient samples is seen. 

The differential display method (Liang et al, 1992a and 1992b) allows simultaneous 
comparison of multiple tissue samples. Initial studies using this method (reverse 
transcriptase followed by PCR) were unsatisfactory because of unwanted amplification
of contaminating DNA in tissue samples and the small size of many of the fragments 
identified by display. To circumvent some of these problems, we have attempted to 
combine the advantages of cDNA library screening with the advantages of differential 
display by: 

1) Constructing cDNA libraries from the tissue mRNA samples; 

2) Performing differential display on the plasmid DNA prepared from the
cDNA libraries; 

3) Subcloning the fragments identified by differential display; 

4) Using the subcloned fragment as a probe to clone the cDNA from the 
appropriate library. 

EXAMPLE 5

Identification of a gene (RibRed) which is differentially expressed in multiple
NCDCIS cases

Employing these methods, 10 differentially expressed clones were identified and
the seven that showed the greatest difference in expression between multiple samples
were further characterized by DNA sequencing. Comparison of the sequenced clones
with GenBank demonstrated that six of the clones are apparently unique sequences
(although further DNA sequencing is necessary), but that one of the clones (here
termed DCIS-1 and described in Sequence Listing No. 1) showed 90% homology to the
previously cloned hamster gene encoding the M2 subunit of ribonucleotide reductase
(Pavloff et al, 1992; Hurta et al, 1991; Hurta et al, 1991). Although human M2
ribonucleotide reductase has been cloned previously, comparison of the hamster cDNA
sequence with our clone and with the prior human clone indicates that DCIS-1 is
homologous to an alternatively poly-adenylated form of the human ribonucleotide
reductase which has not been cloned previously. Figure 6 shows a comparison of the
sequence between DCIS-1 and the human and hamster genes.
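
A figure such as "90% homology" is commonly obtained as the percentage of identical positions in a pairwise alignment. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name and the 20-base toy sequences are hypothetical and are not the DCIS-1, human, or hamster M2 sequences; the text does not state how its 90% figure was computed.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap columns at which two sequences are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical toy alignment (two mismatches in 20 columns).
toy_query = "ATGCTGACCTGAAGCTTGCA"
toy_subject = "ATGCTGACTTGAAGCTAGCA"
identity = percent_identity(toy_query, toy_subject)
print(identity)
```

Excluding gap columns is one convention among several; alignment programs may instead count gaps as mismatches, which gives a lower figure for the same alignment.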

Because of our concern that different patients may have differential gene
expression which is idiosyncratic (or related to morphological differences in biopsy
appearance) and not necessarily related to the induction or progression of DCIS, we
simultaneously analyzed gene expression in multiple DCIS samples compared to
multiple control samples. We constructed cDNA libraries from the following samples:

1) Cultured HMEC epithelial cells;

2) Reduction mammoplasty: 11 year old with virginal hyperplasia;

3) Reduction mammoplasty: 28 year old patient;

4) Reduction mammoplasty: 35 year old patient;

5) DCIS patient #12;

6) DCIS patient #8;

7) DCIS patient #10;

8) DCIS patient #10 from an area of invasive cancer adjacent to DCIS.



In addition to the samples we employed to construct cDNA libraries shown 
above, we also obtained frozen tissue samples from 7 more DCIS patients, 2 cellular 
fibroadenoma samples, and samples of "usual hyperplasia" and atypical hyperplasia.

Because the DCIS clones were identified by cloning methods which include 
selection and amplification, it was important to confirm by nuclease protection assays 
that the genes were differentially expressed in the original unselected, unamplified RNA 
samples (Figure 7). 

This approach allowed identification of a human gene similar to the hamster RibRed 
gene (coding for the M2 subunit) and 7 other human genes as genes which are 
differentially expressed in a majority of cases of DCIS in human breast tissue. The 
table of differentially expressed genes lists the genes which have been identified as
differentially expressed genes in DCIS tissue samples as compared to that in normal
tissue (Figure 9).

EXAMPLE 6 

Methods for studying potential use of differentially expressed genes for diagnostic
screening

One advantage of the differential display method is that it allows comparison of
multiple tissue samples of pre-invasive or invasive breast cancer. For example, use of
this method has successfully demonstrated that the M2 subunit ribonucleotide reductase
gene is differentially expressed in 4 out of 5 pre-invasive breast cancer tissue samples.
It is significant that the M2 subunit is involved in the regulation of the ribonucleotide
reductase gene and is found to be over-expressed in abnormal tissue samples.

Identification of differentially expressed genes may lead to discovery of genes
which are potentially useful for breast cancer screening. Of particular interest are
genes whose expression is restricted to breast epithelial cells and whose gene products
are secreted. Screening for secreted proteins is possible by using the known
hydrophobic sequences which encode leader sequences as one primer for differential
display. The identification of secreted proteins which are specific for early breast
pre-malignancy (or even early invasive cancer) would provide an important tool for early
breast cancer screening programs. If a differentially expressed gene has not been
cloned previously (or if details of its expression are unknown or uncertain) then
nuclease protection assays or Northern blots can be performed on RNA prepared from
tissue samples from a variety of tissues to determine if expression of this gene is
restricted to breast. If necessary, cDNA libraries prepared from other tissues can be
added to the differential display screen as a way to identify only those genes which are
expressed in early breast cancer and, in addition, are only expressed in breast tissue.

Once differentially expressed genes have been initially characterized for expression
in pre-malignant and malignant breast disease, antibodies to the protein products of
potentially useful genes can be developed and employed for immunohistochemistry
(Harlow et al, 1988). This will provide an additional test to determine whether the
expression of this gene is restricted to the breast. Subsequently, these antibodies will
be used to detect the presence of this protein in the blood of patients with
pre-invasive and/or invasive cancer. By assaying for serum protein levels in the same
patients who exhibited elevated expression of the gene in their tissue samples it will be
possible to determine whether a gene product is being secreted into the blood.

EXAMPLE 7

Decreased expression of BRCA1 accelerates growth and is observed during breast
cancer progression

Breast cancer occurs in hereditary and sporadic forms. Recently the BRCA1
gene has been cloned and shown to be mutated in kindreds with hereditary breast and
ovarian cancer (Hall et al. 1990, Miki, Y. et al. 1994, Friedman et al. 1994, Castilla
et al. 1994, Simard et al. 1994). Although 92% of families with two or more cases of
early-onset breast cancer and two cases of ovarian cancer have germ-line mutations in
BRCA1 (Narod et al. in press), the gene has not been shown to be mutated in any
truly sporadic case to date (Futreal et al. 1994). Despite the surprising paucity of
somatically acquired mutations in sporadic breast cancer, BRCA1 is still a likely tumor
suppressor gene with a key role in breast epithelial cell biology. The BRCA1 gene
encodes a protein of 1863 amino acids with a predicted zinc finger domain observed
in proteins which regulate gene transcription.

As an initial characterization of the regulation and function of the BRCA1 gene,
we analyzed and manipulated expression of BRCA1 mRNA levels. The results taken
together indicate that the BRCA1 gene product is a negative regulator of mammary cell
proliferation which is expressed at diminished levels in sporadic breast cancer.

Expression of BRCA1 mRNA during breast cancer progression

As described above, microscopy-directed cloning has been employed to compare
gene expression in normal mammary epithelium, carcinoma in-situ, and invasive breast
cancer. This method produces predominantly epithelial mRNA with minimal
contamination from stromal elements and we used this approach to obtain mRNA from
normal and neoplastic tissues from patients without a family history of breast cancer.
Expression of BRCA1 exon 24 in human breast tissue samples is shown in Fig. 10. The
legend of Fig. 10 is as follows.

The following tissue samples were used for mRNA isolation: Normal tissue
samples: NL1- cultured human breast epithelial cells, NL2- histologically normal breast
tissue from an 11 year old undergoing a reduction mammoplasty, NL4- histologically
normal breast tissue from a 14 year old undergoing a reduction mammoplasty.
Carcinoma-in-situ samples are #6, #8, #10, #12, #23 (comedo type), #41, #55; and
invasive cancer samples #10CA (invasive cancer from the same patient with
carcinoma-in-situ), 36CA, ICA. All of these tissue samples were obtained from patients who had
no family history of hereditary breast cancer and RNA preparation was performed as
described above.

PCR detection of BRCA1 exon 24 in cDNA libraries from the following tissue
samples is described in Figure 10A. Lane 1: human genomic DNA, lane 2: NL1, lane
3: NL4, lane 4: #8, lane 5: #12, lane 6: #10, lane 7: #10CA, lane 8: #41, lane 9: #23,
lane 10: 36CA, lane 11: lambda DNA. The arrow points to the expected 113 bp band.

Nuclease protection assays of microdissected mRNA from tissue samples are
described in Fig. 10B. One ug of mRNA from each tissue sample was hybridized with
32P-labelled, T7 polymerase-generated RNA probes for BRCA1 and human
glyceraldehyde-3-phosphate dehydrogenase (GAPD) which produce expected protected
fragments of 113 and 140 bp respectively as indicated by the lines on the right. Data were
quantitated by phosphorimaging. The hybridizing intensity of each BRCA1 band was
normalized to its respective GAPD band. The normalized values of NL1, NL2, and
NL4 were taken as 1, and the intensity in each sample is expressed relative to 1.
Sample 1 employs human leukocyte mRNA; Samples 2-4 are NL1, NL2, and NL4;
Samples 5-9 are #6 (2.8), #8 (3.7), #10 (2.8), #12 (5.9), and #55 (1.4); and
Samples 10-12 are #10CA (0.07), 36CA (0.13), and ICA (0.2).
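
The relative values quoted in the legend can be converted into fold differences against normal tissue. The sketch below assumes, as the legend suggests, that the normal samples define a reference value of 1; the sample values are those reported above, while the function and variable names are illustrative.

```python
# Relative BRCA1/GAPD values from the legend (normal samples taken as 1).
dcis = {"#6": 2.8, "#8": 3.7, "#10": 2.8, "#12": 5.9, "#55": 1.4}
invasive = {"#10CA": 0.07, "36CA": 0.13, "ICA": 0.2}


def fold_reduction(relative_value, reference=1.0):
    """How many times lower a sample is than the reference (normal = 1)."""
    return reference / relative_value


invasive_folds = {name: round(fold_reduction(v), 1) for name, v in invasive.items()}
dcis_range = (min(dcis.values()), max(dcis.values()))

print(invasive_folds)  # fold by which normal exceeds each invasive sample
print(dcis_range)      # DCIS samples relative to normal
```

With these inputs the invasive samples come out roughly 5 to 14 fold below normal, and the non-comedo DCIS samples 1.4 to 5.9 fold above it.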

Fig. 10 shows that BRCA1 exon 24 mRNA is expressed at 5-10 fold higher
levels in normal mammary tissue than in invasive breast cancer samples. Initial studies
showed detectable levels of BRCA1 cDNA in a cDNA library prepared from a tissue
sample with preinvasive carcinoma-in-situ but not in normal breast or invasive
breast cancer cDNA libraries (Figure 10A). Because this method is relatively
insensitive we directly quantitated BRCA1 mRNA by nuclease protection assays in
RNA samples obtained by our microdissection method described above. These assays
indicate that expression of BRCA1 mRNA in micro-dissected normal mammary
epithelial tissue (lanes 2-4, Figure 10B) is 5-15 fold higher than that in breast cancer
(lanes 10-12, Figure 10B). The highest levels of BRCA1 are observed in samples from
non-comedo ductal carcinoma-in-situ (lanes 5-9, Figure 10B), a premalignant breast
lesion with a finite, but relatively low rate of progression to invasion (Betsill et al.,
1978, Page, D.L. et al., 1982, Page and Dupont, 1990).

Because these studies suggested that invasive breast cancer exhibited lower 
mRNA levels than normal breast epithelial cells, we compared expression of paired 
samples of normal breast and invasive cancer from the same patient (Figure 11A;
compare lanes 2 and 3, 4 and 5, 6 and 7). The legend of Fig. 11 is as follows.

Nuclease protection assays of RNA obtained from paired samples of invasive
breast cancer and histologically normal breast tissue are shown in Fig. 11A. Samples
in lanes 2 and 3 (first patient), 4 and 5 (second patient), 6 and 7 (third patient) are from
invasive cancer and normal breast tissue respectively. Lane 1 is NL1 mRNA as
described in legend to Fig. 10 and lane 8 is human leukocyte mRNA. Ratios of
BRCA1/GAPD for each sample: lane 1: 25.9, lane 2: 1.8, lane 3: 7.6, lane 4: 2.0,
lane 5: 12.4, lane 6: 0.7, lane 7: 6.0. The probes and methods are as described in Fig.
10 except the GAPD probe was of lower specific activity to improve quantitation. 
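
For the paired samples, the normal-to-cancer fold difference follows directly from the lane ratios listed in this legend. This is a minimal sketch using those reported ratios; the pairing of lanes follows the legend (even lanes invasive, odd lanes normal), and the dictionary layout and names are illustrative.

```python
# BRCA1/GAPD ratios from the Fig. 11A legend, as (invasive lane, normal lane).
paired_ratios = {
    "patient 1": (1.8, 7.6),   # lanes 2 and 3
    "patient 2": (2.0, 12.4),  # lanes 4 and 5
    "patient 3": (0.7, 6.0),   # lanes 6 and 7
}

# Fold by which the normal sample exceeds the matched invasive sample.
paired_fold = {
    patient: round(normal / cancer, 1)
    for patient, (cancer, normal) in paired_ratios.items()
}
print(paired_fold)
```

The three pairs give about 4.2, 6.2 and 8.6 fold, in line with the roughly four to eight fold difference described for these paired samples.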

Nuclease protection assays of RNA from a series of invasive breast cancer tissue
samples (lanes 2-9) compared with NL1 (lane 1) and leukocyte mRNA (lane 10) are
shown in Fig. 11B. Ratios of BRCA1/GAPD for each sample: lane 1: 19.1, lane 2:
0.3, lane 3: 1.8, lane 4: 1.6, lane 5: 0.2, lane 6: 0.3, lane 7: 1.9, lane 8: 0, lane 9:
0.6.

Although the samples were paired in Fig. 11A, they were not microdissected
so this approach overestimates the relative expression level of invasive samples because
they have a greater percentage of epithelial cells. BRCA1 RNA levels were four to eight fold
higher in samples derived from normal breast than in samples derived from invasive
breast cancer. We next analyzed expression levels in 8 non-hereditary invasive cancer
samples (Figure 11B: lanes 2-9). Although these samples showed some variability in
expression level, all had lower levels of BRCA1 mRNA (determined by ratio of
BRCA1 to GAPD) than the primary breast epithelial cell line or the normal breast
samples shown in Figure 11A.

Effects of BRCA1 gene inhibition on proliferative rate and gene expression

Having demonstrated that mRNA expression levels of BRCA1 are higher in
normal mammary cells than in cancer cells, we used antisense methods to test the
hypothesis that BRCA1 expression inhibits cell growth. Unmodified 18 base
deoxyribonucleotides complementary to the BRCA1 translation initiation site were
synthesized and added to cultures of primary mammary epithelial cells (Stampfer et al.
1980) or MCF-7 breast cancer cells (Soule and McGrath, 1980). Figure 12 is a graph
showing growth rate of human primary mammary epithelial cells (A), MCF-7 cells (B),
and retinal pigmented epithelial cells (C), cultured as described below. Points and bars
represent the mean and the 95% confidence interval of triplicate counts of cells
incubated with a single bolus of the indicated concentration of antisense or control sense
deoxyribonucleotide.

The morphologic appearance of the cell lines was not noticeably changed by
addition of antisense oligonucleotide, but the proliferative rate was faster. Incubation
of cells with 40 uM anti-BRCA1 oligonucleotide produced accelerated growth of both
normal (Figure 12A) and malignant mammary cells (Figure 12B), but did not affect the
growth of human retinal pigmented epithelial cells (Figure 12C). An intermediate dose
of anti-BRCA1 oligonucleotide produced a less pronounced but significant increase in
cell growth rate. This was not a toxic effect of the oligonucleotide since a control
"sense" oligomer with the same GC content did not increase the proliferation rate, and
because an addition of a 10 fold excess of sense oligomer to the anti-BRCA1 oligomer
reversed the growth activation.
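
The relationship between an antisense oligomer and its "sense" control can be sketched in a few lines: the antisense strand is the reverse complement of the sense strand, and the two necessarily share the same GC content. The 18-base sequence below is a hypothetical placeholder containing an ATG codon; it is not the BRCA1 translation initiation sequence, which is not given in this passage, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense(sense):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sense.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical 18-base sense sequence spanning a start codon (NOT BRCA1).
sense_oligo = "GCCACCATGGATCTGTCA"
anti_oligo = antisense(sense_oligo)

print(anti_oligo)
print(gc_fraction(sense_oligo), gc_fraction(anti_oligo))
```

Because complementation swaps G with C and A with T, a sense control matches the GC content of the antisense oligomer exactly, which is why it serves to control for composition-dependent, non-specific effects.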

In order to critically evaluate the function of BRCA1 gene inhibition on growth
stimulation and cell cycle progression it was necessary to identify a gene whose
expression is cell cycle regulated in human mammary cells. The gene encoding the M2
subunit of ribonucleotide reductase is amplified in conditions of nucleotide starvation
(Hurta and Wright 1992) and as shown above, exhibits elevated levels of expression in
premalignant breast disease. Because ribonucleotide reductase constitutes the rate
limiting step in DNA synthesis, we reasoned that it might be cell cycle regulated in a
synchronous growth model such as MCF-7 cells which can be growth arrested by
tamoxifen and then restimulated by estrogen (Aitken et al. 1985, Arteaga et al. 1989).
MCF-7 cells were growth arrested by tamoxifen for 48 hours and then stimulated at
time zero (0) with 1 uM estradiol (+E) or control vehicle (-E). Inhibition of DNA
synthesis by tamoxifen and induction of synthesis by estrogen were confirmed by
nuclear labelling studies with tritiated thymidine.

Fig. 13 panels A and B show that transcription of the ribonucleotide reductase
M2 gene is cell cycle regulated, inhibited by tamoxifen, and induced by estrogen. Fig.
13A is a Northern blot of mRNA from synchronized MCF-7 cells. At the indicated
time in hours, total cellular RNA was isolated and Northern blotting performed using
the 1.6 Kb Eco RI fragment from our cloned human ribonucleotide reductase cDNA
described above. Two mRNA species of 1.6 and 3.4 Kb are observed in these studies.

Fig. 13B shows nuclear runon studies of synchronized MCF-7 cells, which were
performed by our published methods (Holt et al. 1988) employing the 1.6 Kb fragment
of ribonucleotide reductase described above (RR); the 1.8 Kb fragment of
Topoisomerase II (Topo) described in Olsen et al. (1993); the 1.0 Kb cyclophilin
gene (Thompson et al. 1994) used as a constitutive control; and 18S ribosomal RNA
(Thompson et al. 1994). Con represents cells which were grown for 48 hours but not
treated with tamoxifen.

Antisense inhibition is a useful strategy for studying gene expression which is
dependent on expression of the antisense target gene (Robinson-Benion and Holt, in
press, 1995), e.g. genes whose expression is directly or indirectly dependent on
BRCA1 levels. Fig. 14 demonstrates that antisense inhibition of BRCA1 results in a
corresponding increased expression of M2 ribonucleotide reductase mRNA. A nuclease
protection assay of mRNA derived from primary mammary epithelial cells (lanes 1-4,
9-10) or MCF-7 cells (lanes 5-8, 11-12) cultured for 4 days with antisense or control
oligonucleotide was performed under the following conditions: no oligonucleotide (lanes
1 and 5); 40 uM anti-BRCA1 (lanes 2, 6, 10, 12); 4 uM anti-BRCA1 (lanes 3 and 7); 40 uM
sense control (lanes 4, 8, 9, 11). Probes for BRCA1 and GAPD are as described for
Figure 10, and the ribonucleotide reductase M2 probe (RR), which detects a 200 bp
protected fragment, is described above.

Ribonucleotide reductase mRNA levels are highest in samples treated with 40
uM anti-BRCA1 oligonucleotide for both primary mammary epithelial cells and for
MCF-7 cells (Fig. 14). Antisense inhibition of BRCA1 results in a 70-90% inhibition
of mRNA levels in anti-BRCA1 treated cells compared with cells treated with the
"sense" control oligonucleotide (compare lanes 9 and 10, Fig. 14). Note that MCF-7
cells have lower levels of BRCA1 than the normal mammary epithelial cells (compare
lanes 1 and 5, Fig. 14) but still respond to anti-BRCA1, since the antisense inhibition
may drop BRCA1 levels below a critical threshold which normally functions to inhibit
growth.
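
A percent inhibition such as the 70-90% figure above can be derived from GAPD-normalized band intensities, as in the following minimal sketch. It assumes each lane is first normalized to its own GAPD signal and then compared with the sense control lane; the function names and the densitometry values are hypothetical and are not taken from Fig. 14.

```python
def normalized_level(target_signal, gapd_signal):
    """Target band intensity divided by the GAPD band from the same lane."""
    if gapd_signal <= 0:
        raise ValueError("GAPD signal must be positive")
    return target_signal / gapd_signal


def percent_inhibition(antisense_lane, sense_lane):
    """Percent drop in normalized level, antisense lane vs. sense control lane.

    Each lane is a (target_signal, gapd_signal) pair in arbitrary units.
    """
    treated = normalized_level(*antisense_lane)
    control = normalized_level(*sense_lane)
    if control <= 0:
        raise ValueError("control lane must have a positive normalized level")
    return 100.0 * (1.0 - treated / control)


# Hypothetical densitometry values (arbitrary units), not data from the study.
inhibition = percent_inhibition(antisense_lane=(20.0, 100.0),
                                sense_lane=(100.0, 100.0))
print(f"{inhibition:.0f}% inhibition")
```

Normalizing to GAPD within each lane corrects for unequal RNA loading before the two lanes are compared.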
Methodology 

Tissue samples. Freshly obtained breast biopsy or reduction mammoplasty
specimens were frozen and then RNA was obtained following the microdissection
method described above. Lesions were selected which were microlocalized and
homogeneous so that pure lesions could be obtained by 2 mm punches. Samples which
had admixed normal epithelium, carcinoma-in-situ, or invasive cancer were not used for
this study. Family history was obtained by chart review and/or interview to exclude
familial breast cancer cases.

Nuclease Protection Assays. PCR primers were derived from BRCA1
sequence in GenBank (Accession number U14680); forward 5'
CAATTGGGCAGATGTGT 3' and reverse 5' CTGGGGGATCTGGGGTATCA 3',
which amplify a 113 bp region from exon 24, corresponding to bases 5587 to 5699 of
human BRCA1. This region was selected because this exon has not been reported
to be differentially spliced, unlike more 5' exons. The BRCA1 probe was cloned by
subcloning this 113 bp band from normal human genomic DNA into PCRscriptSK and
screening for correct orientation. One ug of mRNA from each tissue sample was
hybridized with 32P-labelled, T7 polymerase-generated RNA probes for BRCA1 and
human glyceraldehyde-3-phosphate dehydrogenase (GAPD) which would produce
expected protected fragments of 113 and 140 bp respectively. The construction and use
of the GAPD probe for RNA standardization has been described above. The probe for
ribonucleotide reductase M2 mRNA is the same as above and detects a 200 bp
protected fragment.
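
The stated probe length follows directly from the base coordinates, as the short check below illustrates. It assumes the coordinates are 1-based and inclusive at both ends, which is the usual GenBank convention; the function name is hypothetical.

```python
def amplicon_length(first_base, last_base):
    """Length in bp of a region given inclusive, 1-based base coordinates."""
    if last_base < first_base:
        raise ValueError("last_base must not precede first_base")
    return last_base - first_base + 1


# Coordinates quoted in the text for the exon 24 region of BRCA1.
brca1_probe_bp = amplicon_length(5587, 5699)
print(brca1_probe_bp)
```

The result agrees with the 113 bp protected fragment expected for the BRCA1 probe.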

Antisense oligonucleotide studies. Unmodified deoxyribonucleotides were
analyzed by gel electrophoresis and UV shadowing and shown to be homogeneous and
of appropriate size. These oligonucleotides were purified by multiple lyophilization and
solubilized in buffered media as described (Holt et al. 1988). The sequences of the
unmodified anti-BRCA1 oligonucleotide 5' AAGAGCAGATAAATCCAT 3' and the
complementary sense oligonucleotide 5' ATGGATTTATCTGCTCTT 3' correspond to
the presumed translation initiation site at bases 120-137 of the GenBank sequence. The
antisense oligonucleotide sequence was searched against GenBank and no significant
homologies were identified to genes except BRCA1. Oligonucleotides were used
according to our published methods (Holt et al. 1988). Primary mammary epithelial
cells were cultured in serum-free medium supplemented with epidermal growth factor,
insulin, hydrocortisone, ethanolamine, phosphorylethanolamine, and bovine pituitary
extract. MCF-7 cells were cultured in Minimum Essential Medium Eagle (Modified)
with Earle's salts and 2 g/L sodium bicarbonate, supplemented with 2 mM
glutamine, GMS-A (Gibco Cat. #680-1300AD), nonessential amino acids, and 2.5%
fetal calf serum. Retinal pigmented epithelial cells were cultured in DMEM and 10%
calf serum.
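
The relationship between the two oligonucleotides can be checked with a short script. The sketch below uses the sequences quoted in the text; the helper names are hypothetical. It confirms that the antisense oligomer is the reverse complement of the sense oligomer and that both have the same GC content, which is the property relied on for the sense control.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Oligonucleotide sequences as quoted in the text, written 5' to 3'.
ANTISENSE = "AAGAGCAGATAAATCCAT"
SENSE = "ATGGATTTATCTGCTCTT"

print(reverse_complement(SENSE) == ANTISENSE)
print(len(SENSE), gc_fraction(SENSE), gc_fraction(ANTISENSE))
```

Because a reverse complement swaps each G for a C and each C for a G, the GC content of the two strands is necessarily identical.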

Our results indicate that the BRCA1 gene is expressed at higher levels in normal
mammary cells than in breast cancer cells and that diminished expression of BRCA1
increased the proliferative rate of breast cells. This correlates well with the recent
finding that patients with BRCA1 gene-linked hereditary breast cancer have tumors that
grow more rapidly than comparable sporadic tumors (Marcus, J. et al. 1994). The
decreased mRNA levels which were observed in sporadic breast cancers are not a
consequence of differential splicing of the gene since the RNAs were quantitated with
probes from the 3' end of the mRNA, which is not a region where differential splicing
is reported to occur (Miki, Y. et al. 1994). Invasive sporadic cancers have BRCA1
mRNA levels which vary from 0 (in one case) to 20% of the levels observed in normal
human mammary epithelium.

Examples 8 and 9 describe applications of the discovery of the function of the
BRCA1 gene. Example 8 describes a gene therapy method and Example 9 describes
a drug screening method. The discovery of the diminished expression of the BRCA1
mRNA in breast cancer using the microdissection techniques of this invention provides
an important scientific basis for these examples.
Example 8 

Gene Therapy method based on determination of the function of the BRCA1 Gene

Viral vectors containing a DNA sequence that codes for a protein having an
amino acid sequence as essentially set forth in SEQ ID NO:49 can be constructed using
techniques that are well known in the art. This sequence includes the BRCA1 gene
product. Viral vectors containing a DNA sequence essentially as set forth in SEQ ID
NO:47 (the BRCA1 gene) can also be constructed using techniques that are well known
in the art. Retroviral vectors, adenoviral vectors, or adeno-associated viral vectors are
all useful vehicles for delivering genes into breast cancer cells. An excellent candidate
for use in breast cancer gene therapy is a Moloney-based retroviral vector with a breast
selective MMTV promoter which we have reported previously (Wong et al). The viral
vector is constructed by cloning the DNA sequence essentially as set forth in SEQ ID
NO:47 into a retroviral vector such as a breast selective vector. Most preferably, the
full-length (coding region) cDNA for BRCA1 is cloned into the retroviral vector. The
retroviral vector would then be transfected into virus producing cells in the following
manner: Viruses are prepared by transfecting PA317 cells with retroviral vector DNAs
which were purified as described in Wong et al. Following transfection, the PA317
cells are split and then treated with G418 until individual clones can be identified and
expanded. Each clone is then screened for its titer by analyzing its ability to transfer
G418 resistance (since the retroviral vector contains a Neomycin resistance gene). The
clones which have the highest titer are then frozen in numerous aliquots and tested for
sterility, presence of replication-competent retrovirus, and presence of mycoplasma. The
methods generally employed for construction and production of retroviral vectors have
been described in Muller, 1990.
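
Titer screening by transfer of G418 resistance is commonly summarized as colony-forming units per mL of viral supernatant. The following is a minimal sketch under that assumption; the text does not give the calculation, and the function names, clone names, colony counts, dilutions, and volumes are all hypothetical.

```python
def titer_cfu_per_ml(resistant_colonies, dilution_factor, inoculum_ml):
    """G418-resistant colony-forming units per mL of undiluted supernatant.

    resistant_colonies: colonies counted on the target-cell plate
    dilution_factor:    fold dilution of the supernatant (e.g. 1000 for 1:1000)
    inoculum_ml:        volume of diluted supernatant applied to the cells
    """
    if dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution factor and inoculum volume must be positive")
    return resistant_colonies * dilution_factor / inoculum_ml


def highest_titer_clone(titers):
    """Return the name of the producer clone with the highest titer."""
    return max(titers, key=titers.get)


# Hypothetical producer clones and counts, not data from the study.
clone_titers = {
    "clone_A": titer_cfu_per_ml(42, 1_000, 1.0),
    "clone_B": titer_cfu_per_ml(15, 10_000, 1.0),
}
best = highest_titer_clone(clone_titers)
print(best, clone_titers[best])
```

A clone with fewer colonies at a higher dilution can still have the higher titer, which is why the counts are scaled before clones are compared.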

Once high titer viral vector producing clones have been identified, then patients
with breast cancer can be treated by the following protocol: Viral vector expressing
BRCA1 is infused into either solid tumors or malignant effusions as a
means for altering the growth of the tumor (since it is shown above that the BRCA1
gene product decreases the growth rate of breast cancer cells). Because viral vectors
can efficiently transduce a high percentage of cancer cells, the tumors would be growth
inhibited.
Example 9 

Method of Screening Compounds Capable of Activating Promoter Region of the
BRCA1 Gene

The discovery of the function of the BRCA1 gene provides a clear utility in that
induction of expression of the gene and the resulting increase in level of protein
encoded by the gene in the breast cancer cell should slow the proliferation of the breast
cancer cells. Induction of expression of the gene can be caused by administering a
compound to a patient that stimulates the regulatory regions of this gene, such as the
promoter.

A method for screening compounds that activate the promoter of the BRCA1
gene is designed in the following way. A promoter sequence is a DNA segment that
upregulates the expression of a gene. A sequence essentially as set forth in SEQ ID
NO:48 can be ligated into a suitable vector, such as a plasmid, that contains a reporter
gene using standard recombinant DNA techniques of restriction enzyme digests, ligation
of fragment into vector, and transformation of bacteria. SEQ ID NO:48 includes the
promoter sequence of the BRCA1 gene. A reporter gene is a gene that produces a
readily detectable product. Examples of appropriate reporter genes which could be
employed for this purpose include the Beta-galactosidase gene or the chloramphenicol
acetyltransferase gene.

The BRCA1 promoter/reporter gene combination can then be cloned into an
expression vector or viral vector by standard recombinant DNA methods. Breast
cancer cells can then be transfected with the expression vector containing the BRCA1
promoter/reporter gene using standard transfection methods which we have reported
previously (Holt et al. PNAS 1986). A stable transformant with appropriate low level
expression (breast cancer cells have low level BRCA1 expression as shown above) will
be identified and then characterized to demonstrate proper DNA integration and
expression. Methods of establishing and characterizing stable transformants have been
described (Holt, MCB 1994). Once an appropriate stable transformant cell line is
identified, then we can plate the cell line in a manner that permits screening of
hundreds or thousands of drugs or biological agents (for example in multiple 96 well
microtiter plates). Level of expression of the reporter gene can be quantitated and
agents which activate expression are thus identified. A positive result (i.e. induction
of the promoter region) results in increased levels of the reporter gene product, giving
either an increase in color (Beta-galactosidase assay) or specific radioactivity
(chloramphenicol acetyltransferase activity) through a reaction between the protein
encoded by the reporter gene and a compound in the reaction medium. The compound
produced by the reaction between the reporter gene protein and the compound in the
reaction medium is the cause of the increase in color or specific radioactivity. These
compounds can be called indicator compounds in that their presence indicates that the
drug or biological agent activated the BRCA1 promoter. Methods for standardizing
and performing Beta-galactosidase or chloramphenicol acetyltransferase assays have
been reported (Holt et al. MCB 1994). This method would be useful for initial
screening of agents which increase BRCA1 expression. These agents could then be
tested in more rigorous assays of breast cancer growth such as nude mouse tumor
assays (Arteaga et al). This approach allows mass screening of large numbers of
agents, reserving more rigorous animal tests for only those promising compounds which
score in the reporter gene assay described herein.
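
Scoring of a microtiter plate in such a reporter screen might be sketched as follows. This is an illustration only: it assumes each plate carries vehicle-only control wells and that an agent is called a hit when its reporter signal is at least a chosen multiple of the mean vehicle signal. The 2-fold threshold, the well labels, the function names, and the readings are hypothetical and do not come from the text.

```python
from statistics import mean


def fold_induction(well_signal, vehicle_signals):
    """Reporter signal of one well relative to the mean vehicle-control signal."""
    baseline = mean(vehicle_signals)
    if baseline <= 0:
        raise ValueError("vehicle baseline must be positive")
    return well_signal / baseline


def find_hits(plate, vehicle_wells, threshold=2.0):
    """Return {well: fold induction} for test wells at or above the threshold.

    plate:         mapping of well label to reporter reading
    vehicle_wells: set of well labels that received vehicle only
    threshold:     minimum fold induction to call a hit (assumed value)
    """
    vehicle_signals = [plate[well] for well in vehicle_wells]
    hits = {}
    for well, signal in plate.items():
        if well in vehicle_wells:
            continue  # control wells are the baseline, not candidates
        fold = fold_induction(signal, vehicle_signals)
        if fold >= threshold:
            hits[well] = fold
    return hits


# Hypothetical absorbance readings from a Beta-galactosidase color assay.
plate = {"A1": 0.10, "A2": 0.10, "B1": 0.12, "B2": 0.35, "B3": 0.09}
hits = find_hits(plate, vehicle_wells={"A1", "A2"})
print(sorted(hits))
```

Agents flagged this way would still need confirmation in the more rigorous growth assays described above.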

Thus, although there have been described particular embodiments of the present
invention of a new and useful "Method for Detection and Treatment of Breast Cancer",
it is not intended that such embodiments be construed as limitations upon the scope of
this invention except as set forth in the following claims. It will be apparent to those
skilled in the art that many changes and modifications may be made without departing
from the invention in its broader aspects. For example, the above described techniques
may be used in the diagnosis of other diseases and detection of differential genetic
expression from microscopically-directed tissue samples of pathologic tissue. The
production of a cDNA library produced as a result of the differential expression of
genes in pathologic tissue in comparison to normal tissue provides the opportunity for
further diagnostic capabilities. Further, although there have been described certain
experimental conditions used in the preferred embodiment, it is not intended that such
conditions be construed as limitations upon the scope of this invention except as set
forth in the claims.

The following references are included to provide details of scientific technology and are
herein incorporated by reference to the extent that they provide additional information
for the purposes of indicating the background of the invention or illustrating the state
of the art.

ADDITIONAL DESCRIPTION OF THE FIGURES

Figure 2: Model for premalignant conditions, highlighting magnitude of risk
for progression to clinical malignancy. Terms from human breast neoplasia are used:
no proliferative disease (No Pro), proliferative disease without atypia (PDWA), atypical
hyperplasia (AH), carcinoma in situ (CIS). As in the proposal of tumor progression, each
stage is more likely to proceed to the next (dotted lines), but could also remain stable
(horizontal lines, probably fairly frequent), or directly proceed to develop a clone of
cells with malignant behavior (vertical lines, becoming more likely further to the right).

Figure 5: Differential display of cDNAs obtained from patient tissue samples
and controls. Rescued cDNA library samples were used as templates for low
stringency PCR with the primers 5' GATGAGTTCGTGTCCGTACAACTGG 3' and 5'
GGTTATCGAAATCAGCCACAGCGCC 3'; 40 cycles were performed at conditions
described above. Samples (see legend to Figure 4): Lane 1 - #12; Lanes 2 and 3 -
separate phage rescues of NL1 to show reproducibility of the assay; Lane 4 - #8; Lane
5 - #10; Lane 6 - #10CA; Lane 7 - control from the rescued phage vector without
cDNA inserts. Arrows mark cDNAs which are overexpressed in DCIS versus normal.
Arrowheads mark cDNAs which are differentially expressed in the invasive cancer
(note this may reflect contamination from stromal cells). The bar marks a cDNA which
is expressed in normal breast cells at higher levels than in DCIS or invasive cancer.

Figure 7: Expression of DCIS-1 mRNA in tissue mRNA samples analyzed by
RNase protection assay. Probes: GAPD probe and DCIS-1 clone probe, which was
generated by linearizing the rescued plasmid with Pvu II and should generate a 200 bp
protected fragment. RNA samples were labeled as in the legend to Figure 4.

SEQUENCE LISTINGS
(1) GENERAL INFORMATION: 
APPLICANT: HOLT, JEFFREY T.
JENSEN, ROY A.
PAGE, DAVID L.
OBERMILLER, PATRICE S.
ROBINSON-BENION, CHERYL L.
THOMPSON, MARILYN E.

TITLE OF INVENTION: METHOD FOR DETECTION AND
TREATMENTS OF BREAST CANCER
NUMBER OF SEQUENCES: 49 
CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: I.C. WADDEY, JR. 

(B) STREET: 27TH FLOOR, L & C TOWER, 401 CHURCH 

(C) CITY: NASHVILLE 

(D) STATE: TENNESSEE 

(E) COUNTRY: USA 

(F) ZIP: 37219 
COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 800 kB storage 

(B) COMPUTER: IBM PC/XT/ AT compatible 

(C) OPERATING SYSTEM: MS-DOS (version 5.0) 

(D) SOFTWARE: WordPerfect 5.1/WordPerfect Editor 
CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 
PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: U.S. 08/182,961 

(B) FILING DATE: 14 JAN 1994

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: I.C. WADDEY, JR.

(B) REGISTRATION NUMBER: 25,180

(C) REFERENCE/DOCKET NUMBER: 0216-9409

(ix) TELECOMMUNICATION INFORMATION (O): 

(A) TELEPHONE: (615) 242-2400 

(B) TELEFAX: (615) 242-2221 
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 264

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ 

(H) CELL LINE: not derived from a cell line 
(I) ORGANELLE: no

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human

(B) CLONE: obtained from identification of differential
gene expression

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 
(ix) FEATURE:

(A) NAME/KEY: DCIS-1 

(B) LOCATION: GenBank accession no. L2736 

(C) IDENTIFICATION METHOD: microscopically-directed 
sampling and differential display 

(D) OTHER INFORMATION: gene encoding M2 subunit of
human ribonucleotide reductase
(x) PUBLICATION INFORMATION: unpublished

(K) RELEVANT RESIDUES IN SEQ ID NO: 1 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TTCGGAATTG GGTACGCGGG CCCCCCACTC TGCCGAATTC CTGCATGCGG GGGATCCACT 60

AGTTCAGAGC AGGCCCCCAC CCGTAGCACT CCAGCTTTTG TTCCTTCCCT TTAGTGAGGG 120

TTAATTTTCG AGCTTGGCGT AATCATGGTC ATAGCTGTTT CCTGTGTGAA ATTGTTATCC 180

OrrCACAATT CCACACAACA TACGACCCGC AAGCATAAAA GTGTAAAGCC TCCCCTGCCT 240

AATGAGTGAG CTAACTCACA TTAA 264 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 
(F) TISSUE TYPE: female breast

(G) CELL TYPE: ductal carcinoma in situ

(H) CELL LINE: not derived from a cell line
(I) ORGANELLE: no

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human 

(B) CLONE: obtained from identification of differential gene

expression 

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: DCIS-2 

(B) LOCATION: GenBank accession no. L27637 

(C) IDENTIFICATION METHOD: microscopically-directed 
sampling and differential display

(x) PUBLICATION INFORMATION: unpublished

(K) RELEVANT RESIDUES IN SEQ ID NO: 2 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TAGCCCCCTT ATCGAAATAC CCACAGCCCC TaTCACTAT CAGCAGTACG CCGCCCAGTT 60 
GTACGGACAC GGA 73

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ

(H) CELL LINE: not derived from a cell line

(I) ORGANELLE: no 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human

(B) CLONE: obtained from identification of differential gene

expression 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 
(C) UNITS: unknown

(ix) FEATURE: 

(A) NAME/KEY: DCIS-3 

(B) LOCATION: L27638 

(C) IDENTIFICATION METHOD: microscopically-directed 
sampling and differential display

(x) PUBLICATION INFORMATION: unpublished 
(K) RELEVANT RESIDUES IN SEQ ID NO: 3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TGCCCGATGT GTGTCGTACA ACTGGCGCTG TCCCTGATTT CGATAA 46

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 
(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human 

(B) CLONE: obtained from identification of differential gene 

expression 

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: DCIS-4 

(B) LOCATION: L27640 

(C) IDENTIFICATION METHOD: microscopically-directed 
sampling and differential display

(x) PUBLICATION INFORMATION: unpublished 
(K) RELEVANT RESIDUES IN SEQ ID NO: 4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TAGCCCATCA CTTCGTGTCC GTACAACTG6 CeceCTeTGG CTGATTTCCA TANNNNMACC 60 
ATCAGCCCGA CG 72 

(2) INFORMATION FOR SEQ ID NO:5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ 

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human 

(B) CLONE: obtained from identification of differential gene 

expression 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: DCIS-5 

(B) LOCATION: L27641 

(C) IDENTIFICATION METHOD: microscopically-directed
sampling and differential display
(x) PUBLICATION INFORMATION: unpublished

(K) RELEVANT RESIDUES IN SEQ ID NO: 5 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TAGCCCGGTT ATCGAAATCA CCCACAGCGC CTAACTTCTG CAGAAGCCTT TGACCATCAC 60 
CAGTTGTACG GACACGAACT CATC 84 

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 99

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) ORIGINAL SOURCE 

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ 

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human

(B) CLONE: obtained from identification of differential gene 

expression 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: DCIS-6 

(B) LOCATION: L27642 

(C) IDENTIFICATION METHOD: microscopically-directed 
sampling and differential display 

(x) PUBLICATION INFORMATION: unpublished 
(K) RELEVANT RESIDUES IN SEQ ID NO: 6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTGGTTTCCC AAATTCCTGC GAAGGGGGGT GCTGCCGTGT GGAATTGTCC CGCCCCCTGC 60
TCTGCCGCCG CGTTTTTTGT CTACATTCGT CGTAGCTCG 99

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens 

(C) INDIVIDUAL/ISOLATE: sample of non-comedo DCIS 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: ductal carcinoma in situ 

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA library derived from human 

(B) CLONE: obtained from identification of differential gene

expression 

(viii) POSITION IN GENOME: 

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: DCIS-7 

(B) LOCATION: L27643

(C) IDENTIFICATION METHOD: microscopically-directed
sampling and differential display
(x) PUBLICATION INFORMATION: unpublished

(K) RELEVANT RESIDUES IN SEQ ID NO: 7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATCAGCGCGC GACATTCGGG TACCCGCCCC CCCCCCTCCG TCGGAATTCC TCGAGCCGGC 60
ATCCATAGGA TGTGGAGTTA GTTTTGTT 88

(2) INFORMATION FOR SEQ ID NO: 8 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

CGCGACGGCC GCGCGTCTGC CAGGG 25

(2) INFORMATION FOR SEQ ID NO:9 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CGCCCCTGCG TTACCCTCCC CGCCG 25

(2) INFORMATION FOR SEQ ID NO: 10

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGATCGCGTC CTGTAACCCC ACCXT 25 

(2) INFORMATION FOR SEQ ID NO: 11
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ACTGGGCTGT CCTGCGGTGC CGGCG 25



(2) INFORMATION FOR SEQ ID NO: 12

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CTCAGAGGTA GCCGCGCCGA GGCTG 25 

(2) INFORMATION FOR SEQ ID NO: 13 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCCTGGCCGC GACACGGATT ACCCC 25

(2) INFORMATION FOR SEQ ID NO: 14 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTAGCGCATG GTGGACCTGG AGACG 25

(2) INFORMATION FOR SEQ ID NO: 15 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 
(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TGTGGTTACG TCAGCGAAGG TAATA 25 

(2) INFORMATION FOR SEQ ID NO: 16 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer
(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGTCGCACGC ATCTCACGCT CCGCC 25

(2) INFORMATION FOR SEQ ID NO: 17 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TATCCAAGCG CCAGGCTACG AGGCC 25

(2) INFORMATION FOR SEQ ID NO: 18 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GGCGCGCCCG ACGGTCTGGT ATCTA 25 



(2) INFORMATION FOR SEQ ID NO: 19

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

CTCCCTCCCC GMCTCCeCG TTAGT 25 

(2) INFORMATION FOR SEQ ID NO:20 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

ATGCCCCCGG CTCGGGCCTG GTCCC 25 

(2) INFORMATION FOR SEQ ID NO:21 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CGTGAAGCCT ATGCCCTCCC TCAAC 25 

(2) INFORMATION FOR SEQ ID NO:22 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

GTGCCGTCGT AGCCCTTCAG CGATC 25 

(2) INFORMATION FOR SEQ ID NO:23 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 
(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GCGACAaAG GCTCCCGGAG GAGG6 25 

(2) INFORMATION FOR SEQ ID NO:24 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

TGGGCCAGGC CTCCGGCCCC GGTAT 25 

(2) INFORMATION FOR SEQ ID NO:25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

CCGGAACTGC GATAGCGTCC CTCCC 25 

(2) INFORMATION FOR SEQ ID NO:26

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AGCGCACACC TCTTTCCCGA GAGCC 25 

(2) INFORMATION FOR SEQ ID NO:27 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AACGGGTGGA CATCCGCCTG CCGCC 25 

(2) INFORMATION FOR SEQ ID NO:28 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

TGAACCACGA TGTCAATCGT CCCGA 25

(2) INFORMATION FOR SEQ ID NO:29 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

TCATCCCCGC CGAAAGACGC TCGCC 25

(2) INFORMATION FOR SEQ ID NO:30 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

ATACGCTGCG GCACGCGCTC GGACT 25

(2) INFORMATION FOR SEQ ID NO:31 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GACCAGGTGC GCACGAGCAT GTACA 25

(2) INFORMATION FOR SEQ ID NO:32 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

AGCGTAGTCA TCGGCCTTCC CGCCC 25

(2) INFORMATION FOR SEQ ID NO:33

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GGCCCCTAGC CCAGGGTGAA CCCCA 25 

(2) INFORMATION FOR SEQ ID NO:34 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



CCCAGTGCTA CGGGCCGCCC CAAGC 25 

(2) INFORMATION FOR SEQ ID NO:35 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

CCTTCCTGGG TTACCTCCCC TCGGG 25 

(2) INFORMATION FOR SEQ ID NO:36 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 
(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

TCCGGACAGC AGCCACGCCA AGGGC 25 

(2) INFORMATION FOR SEQ ID NO:37 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 
(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

ACGC6CTGCT CCACCGA66C CTGfcT » 

(2) INFORMATION FOR SEQ ID NO:38 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

CGATGCAAGG CCAGCAGCAC TCGAC 25 

(2) INFORMATION FOR SEQ ID NO:39 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCCCCGGAGC GGACCACCGG ACGTC 25

(2) INFORMATION FOR SEQ ID NO:40

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

AGCCGGGAGG GATCGGGGGC CMGC 25 

(2) INFORMATION FOR SEQ ID NO:41

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GCCTGGTGTA GGCAGGCAGC TCTTA 25

(2) INFORMATION FOR SEQ ID NO:42 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



CCACCCCTGT AGTCCGGGCT GCGAG 25 

(2) INFORMATION FOR SEQ ID NO:43 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



CGAACCCGAC GCCCGTCCAG GGTTC 25

(2) INFORMATION FOR SEQ ID NO:44 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(v) FRAGMENT TYPE: oligonucleotide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

TCGGGCAGCA A6CCCGG&AC 6CTCC 25 

(2) INFORMATION FOR SEQ ID NO:45 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA

(A) DESCRIPTION: PCR primer

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

GACGGGGGAC GGCCTACGTG GCTTA 25

(2) INFORMATION FOR SEQ ID NO:46 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(A) DESCRIPTION: PCR primer 

(iii) HYPOTHETICAL: yes 

(iv) ANTI-SENSE: no 

(v) FRAGMENT TYPE: oligonucleotide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

CTTCTTCCCG CCGGAGAGGC CTGCC 25

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5712

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) ORIGINAL SOURCE 

(A) ORGANISM: Homo sapiens sapiens

(C) INDIVIDUAL/ISOLATE:

(D) DEVELOPMENTAL STAGE: adult

(F) TISSUE TYPE: female breast

(G) CELL TYPE: ductal carcinoma in situ, invasive breast cancer 
and normal breast tissue 

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human 

(B) CLONE: obtained using published sequence 

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 
(ix) FEATURE:

(A) NAME/KEY: BRCA1

(B) LOCATION: GenBank accession no. U14680

(C) IDENTIFICATION METHOD: microscopically-directed
sampling and nuclease protection assay

(D) OTHER INFORMATION: gene encoding BRCA1 protein

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Miki, Y., et al.

(B) TITLE: A strong candidate gene for the breast and ovarian
cancer susceptibility gene BRCA1.

(C) JOURNAL: Science 

(D) VOLUME: 266 

(E) PAGES: 66-71 

(F) DATE: 1994 

(K) RELEVANT RESIDUES IN SEQ ID NO: 47 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

agctcgctga gacttcctgg accccgcacc aggctgtggg gtttctcaga taactgggcc 60

cctgcgctca ggaggccttc accctctgct ctgggtaaag ttcattggaa cagaaagaa 119
atg gat tta tct gct ctt cgc gtt gaa gaa gta caa aat gtc att aat 167
Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10 15

gct atg cag aaa atc tta gag tgt ccc atc tgt ctg gag ttg atc aag 215
Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys

20 25 30

gaa cct gtc tcc aca aag tgt gac cac ata ttt tgc aaa ttt tgc atg 263
Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met

35 40 45

ctg aaa ctt ctc aac cag aag aaa ggg cct tca cag tgt cct tta tgt 311
Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys

50 55 60

aag aat gat ata acc aaa agg agc cta caa gaa agt acg aga ttt agt 359
Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser
65 70 75 80

caa ctt gtt gaa gag cta ttg aaa atc att tgt gct ttt cag ctt gac 407
Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp

85 90 95

aca ggt ttg gag tat gca aac agc tat aat ttt gca aaa aag gaa aat 455
Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn

100 105 110

aac tct cct gaa cat cta aaa gat gaa gtt tct atc atc caa agt atg 503
Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met

115 120 125

ggc tac aga aac cgt gcc aaa aga ctt cta cag agt gaa ccc gaa aat 551
Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn

130 135 140

cct tcc ttg cag gaa acc agt ctc agt gtc caa ctc tct aac ctt gga 599
Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly
145 150 155 160

act gtg aga act ctg agg aca aag cag cgg ata caa cct caa aag acg 647
Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr
165 170 175

tct gtc tac att gaa ttg gga tct gat tct tct gaa gat acc gtt aat 695
Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn

180 185 190

aag gca act tat tgc agt gtg gga gat caa gaa ttg tta caa atc acc 743
Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln Ile Thr

195 200 205

cct caa gga acc agg gat gaa atc agt ttg gat tct gca aaa aag gct 791
Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala

210 215 220

gct tgt gaa ttt tct gag acg gat gta aca aat act gaa cat cat caa 839
Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln
225 230 235 240

ccc agt aat aat gat ttg aac acc act gag aag cgt gca gct gag agg 887
Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg

245 250 255

cat cca gaa aag tat cag ggt agt tct gtt tca aac ttg cat gtg gag 935
His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val Glu

260 265 270

cca tgt ggc aca aat act cat gcc agc tca tta cag cat gag aac agc 983
Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser

275 280 285

agt tta tta ctc act aaa gac aga atg aat gta gaa aag gct gaa ttc 1031
Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe

290 295 300

tgt aat aaa agc aaa cag cct ggc tta gca agg agc caa cat aac aga 1079
Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg
305 310 315 320

tgg gct gga agt aag gaa aca tgt aat gat agg cgg act ccc agc aca 1127
Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr

325 330 335

gaa aaa aag gta gat ctg aat gct gat ccc ctg tgt gag aga aaa gaa 1175
Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu

340 345 350

tgg aat aag cag aaa ctg cca tgc tca gag aat cct aga gat act gaa 1223
Trp Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu

355 360 365

gat gtt cct tgg ata aca cta aat agc agc att cag aaa gtt aat gag 1271
Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu

370 375 380

tgg ttt tcc aga agt gat gaa ctg tta ggt tct gat gac tca cat gat 1319
Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp
385 390 395 400

ggg gag tct gaa tca aat gcc aaa gta gct gat gta ttg gac gtt cta 1367
Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu
405 410 415

aat gag gta gat gaa tat tct ggt tct tca gag aaa ata gac tta ctg 1415
Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu

420 425 430

gcc agt gat cct cat gag gct tta ata tgt aaa agt gaa aga gtt cac 1463
Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Asp Arg Val His

435 440 445

tcc aaa tca gta gag agt aat att gaa gac aaa ata ttt ggg aaa acc 1511
Ser Lys Ser Val Glu Ser Asp Ile Glu Asp Lys Ile Phe Gly Lys Thr

450 455 460

tat cgg aag aag gca agc ctc ccc aac tta agc cat gta act gaa aat 1559
Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn
465 470 475 480

cta att ata gga gca ttt gtt act gag cca cag ata ata caa gag cgt 1607
Leu Ile Ile Gly Ala Phe Val Ser Glu Pro Gln Ile Ile Gln Glu Arg

485 490 495

ccc ctc aca aat aaa tta aag cgt aaa agg aga cct aca tca ggc ctt 1655
Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu

500 505 510

cat cct gag gat ttt atc aag aaa gca gat ttg gca gtt caa aag act 1703
His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys Thr

515 520 525

cct gaa atg ata aat cag gga act aac caa acg gag cag aat ggt caa 1751
Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln

530 535 540

gtg atg aat att act aat agt ggt cat gag aat aaa aca aaa ggt gat 1799
Val Met Asn Ile Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp

545 550 555

tct att cag aat gag aaa aat cct aac cca ata gaa tca ctc gaa aaa 1847
Ser Ile Gln Asn Glu Lys Asn Pro Asn Pro Ile Glu Ser Leu Glu Lys
560 565 570 575

gaa tct gct ttc aaa acg aaa gct gaa cct ata agc agc agt ata agc 1895
Glu Ser Ala Phe Lys Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser

580 585 590

aat atg gaa ctc gaa tta aat atc cac aat tca aaa gca cct aaa aag 1943
Asn Glu Leu Glu Leu Asn Ile Met His Asn Ser Lys Ala Pro Lys Lys

595 600 605

aat agg ctg agg agg aag tct tct acc agg cat att cat gcg ctt gaa 1991
Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu

610 615 620

cta gta gtc agt aga aat cta agc cca cct aat tgt act gaa ttg caa 2039
Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln

625 630 635

att gat agt tgt tct agc agt gaa gag ata aag aaa aaa aag tac aac 2087
Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr Asn
640 645 650 655

caa atg cca gtc agg cac agc aga aac cta caa ctc atg gaa ggt aaa 2135
Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys
660 665 670

gaa cct gca act gga gcc aag aag agt aac aag cca aat gaa cag aca 2183
Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr

675 680 685

agt aaa aga cat gac agc gat act ttc cca gag ctg aag tta aca aat 2231
Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn

690 695 700

gca cct ggt tct ttt act aag tgt tca aat acc agt gaa ctt aaa gaa 2279
Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu

705 710 715

ttt gtc aat cct agc ctt cca aga gaa gaa aaa gaa gag aaa cta gaa 2327
Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
720 725 730 735

aca gtt aaa gtg tct aat aat gct gaa gac ccc aaa gat ctc atg tta 2375
Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu

740 745 750

agt gga gaa agg gtt ttg caa act gaa aga tct gta gag agt agc agt 2423
Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser

755 760 765

att tca ttg gta cct ggt act gat tat ggc act cag gaa agt atc tcg 2471
Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser

770 775 780

tta ctg gaa gtt agc act cta ggg aag gca aaa aca gaa ccc aat aaa 2519
Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys

785 790 795

tgt gtg agt cag tgt gca gca ttt gaa aac ccc aag gga cta att cat 2567
Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His
800 805 810 815

ggt tgt tcc aaa gat aat aga aat gac aca gaa ggc ttt aag tat cca 2615
Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro

820 825 830

ttg gga cat gaa gtt aac cac agt cgg gaa aca agc ata gaa atg gaa 2663
Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu

835 840 845

gaa agt gaa ctt gat gct cag tat ttg cag aat aca ttc aag gtt tca 2711
Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser

850 855 860

aag cgc cag tca ttt gct ccg ttt tca aat cca gga aat gca gaa gag 2759
Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu

865 870 875

gaa tgt gca aca ttc tct gcc cac tct ggg tcc tta aag aaa caa agt 2807
Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser
880 885 890 895

cca aaa gtc act ttt gaa tgt gaa caa aag gaa gaa aat caa gga aag 2855
Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys

900 905 910

aat gag tct aat atc aag cct gta cag aca gtt aat atc act gca ggc 2903
Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly
915 920 925

ttt cct gtg gtt ggt cag aaa gat aag cca gtt gat aat gcc aaa tgt 2951
Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys

930 935 940

agt atc aaa gga ggc tct agg ttt tgt cta tca tct cag ttc aga ggc 2999
Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly

945 950 955

aac gaa act gga ctc att act cca aat aaa cat gga ctt tta caa aac 3047
Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
960 965 970 975

cca tat cgt ata cca cca ctt ttt ccc atc aag tca ttt gtt aaa act 3095
Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr
980 985 990

aaa tgt aag aaa aat ctg cta gag gaa aac ttt gag gaa cat tca atg 3143
Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met

995 1000 1005

tca cct gaa aga gaa atg gga aat gag aac att cca agt aca gtg agc 3191
Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser

1010 1015 1020

aca att agc cgt aat aac att aga gaa aat gtt ttt aaa gaa gcc agc 3239
Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu Ala Ser

1025 1030 1035

tca agc aat att aat gaa gta ggt tcc agt act aat gaa gtg ggc tcc 3287
Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser
1040 1045 1050 1055

agt att aat gaa ata ggt tcc agt gat gaa aac att caa gca gaa cta 3335
Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu

1060 1065 1070

ggt aga aac aga ggg cca aaa ttg aat gct atg ctt aga tta ggg gtt 3383
Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val

1075 1080 1085

ttg caa cct gag gtc tat aaa caa agt ctt cct gga agt aat tgt aag 3431
Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys

1090 1095 1100

cat cct gaa ata aaa aag caa gaa tat gaa gaa gta gtt cag act gtt 3479
His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val

1105 1110 1115

aat aca gat ttc tct cca tat ctg att tca gat aac tta gaa cag cct 3527
Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro
1120 1125 1130 1135

atg gga agt agt cat gca tct cag gtt tgt tct gag aca cct gat gac 3575
Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp

1140 1145 1150

ctg tta gat gat ggt gaa ata aag gaa gat act agt ttt gct gaa aat 3623
Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn

1155 1160 1165

gac att aag gaa agt tct gct gtt ttt agc aaa agc gtc cag aaa gga 3671
Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Lys Gly
1170 1175 1180

gag ctt agc agg agt cct agc cct ttc acc cat act cat ttg gct cag 3719
Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln

1185 1190 1195

ggt tac cga aga ggg gcc aag aaa tta gag tcc tca gaa gag aac tta 3767
Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu
1200 1205 1210 1215

tct agt gag gat gaa gag ctt ccc tgc ttc caa cac ttg tta ttt ggt 3815
Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly

1220 1225 1230

aaa gta aac aat ata cct tct cag tct act agg cat agc acc gtt gct 3863
Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala

1235 1240 1245

acc gag tgt ctg tct aag aac aca gag gag aat tta tta tca ttg aag 3911
Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys

1250 1255 1260

aat agc tta aat gac tgc agt aac cag gta ata ttg gca aag gca tct 3959
Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser

1265 1270 1275

cag gaa cat cac ctt agt gag gaa aca aaa tgt tct gct agc ttg ttt 4007
Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe
1280 1285 1290 1295

tct tca cag tgc agt gaa ttg gaa gac ttg act gca aat aca aac acc 4055
Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr

1300 1305 1310

cag gat cct ttc ttg att ggt tct tcc aaa caa atg agg cat cag tct 4103
Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser

1315 1320 1325

gaa agc cag gga gtt ggt ctg agt gac aag gaa ttg gtt tca gat gat 4151
Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp

1330 1335 1340

gaa gaa aga gga acg ggc ttg gaa gaa aat aat caa gaa gag caa agc 4199
Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser

1345 1350 1355

atg gat tca aac tta ggt gaa gca gca tct ggg tgt gag agt gaa aca 4247
Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr
1360 1365 1370 1375

agc gtc tct gaa gac tgc tca ggg cta tcc tct cag agt gac att tta 4295
Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu

1380 1385 1390

acc act cag cag agg gat acc atg caa cat aac ctg ata aag ctc cag 4343
Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln

1395 1400 1405

cag gaa atg gct gaa cta gaa gct gtg tta gaa cag cat ggg agc cag 4391
Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln

1410 1415 1420

cct tct aac agc tac cct tcc atc ata agt gac tct tct gcc ctt gag 4439
Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu

1425 1430 1435

gac ctg cga aat cca gaa caa agc aca tca gaa aaa gca gta tta act 4487
Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Val Leu Gln Thr
1440 1445 1450 1455

tca cag aaa agt agt gaa tac cct ata agc cag aat cca gaa ggc ctt 4535
Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Xaa

1460 1465 1470

tct gct gac aag ttt gag gtg tct gca gat agt tct acc agt aaa aat 4583
Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn

1475 1480 1485

aaa gaa cca gga gtg gaa agg tcc tcc cct tct aaa tgc cca tca tta 4631
Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu

1490 1495 1500

gat gat agg tgg tac atg cac agt tgc tct ggg agt ctt cag aat aga 4679
Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg
1505 1510 1515 1520

aac tac cca tct caa gag gag ctc att aag gtt gtt gat gtg gag gag 4727
Asn Tyr Pro Pro Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu

1525 1530 1535

caa cag ctg gaa gag tct ggg cca cac gat ttg acg gaa aca tct tac 4775
Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr

1540 1545 1550

ttg cca agg caa gat cta gag gga acc cct tac ctg gaa tct gga atc 4823
Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile

1555 1560 1565

agc ctc ttc tct gat gac cct gaa tct gat cct tct gaa gac aga gcc 4871
Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala

1570 1575 1580

cca gag tca gct cgt gtt ggc aac ata cca tct tca acc tct gca ttg 4919
Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
1585 1590 1595 1600

aaa gtt ccc caa ttg aaa gtt gca gaa tct gcc cag agt cca gct gct 4967
Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala

1605 1610 1615

gct cat act act gat act gct ggg tat aat gca atg gaa gaa agt gtg 5015
Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val

1620 1625 1630

agc agg gag aag cca gaa ttg aca gct tca aca gaa agg gtc aac aaa 5063
Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys

1635 1640 1645

aga atg tcc atg gtg gtg tct ggc ctg acc cca gaa gaa ttt atg ctc 5111
Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu

1650 1655 1660

gtg tac aag ttt gcc aga aaa cac cac atc act tta act aat cta att 5159
Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile
1665 1670 1675 1680

act gaa gag act act cat gtt gtt atg aaa aca gat gct gag ttt gtg 5207
Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val
1685 1690 1695

tgt gaa cgg aca ctg aaa tat ttt cta gga att gcg ggt gga aaa tgg 5255
Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp

1700 1705 1710

gta gtt agc tat ttc tgg gtg acc cag tct att aaa gaa aga aaa atg 5303
Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met

1715 1720 1725

ctg aat gag cat gat ttt gaa gtc aga gga gat gtg gtc aat gga aga 5351
Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg

1730 1735 1740

aac cac caa ggt cca aag cga gca aga gaa tcc cag gac aga aag atc 5399
Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
1745 1750 1755 1760

ttc agg ggg cta gaa atc tgt tgc tat ggg ccc ttc acc aac atg ccc 5447
Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro

1765 1770 1775

aca gat caa ctg gaa tgg atg gta cag ctg tgt ggt gct tct gtg gtg 5495
Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val

1780 1785 1790

aag gag ctt tca tca ttc acc ctt ggc aca ggt gtc cac cca att gtg 5543
Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val

1795 1800 1805

gtt gtg cag cca gat gcc tgg aca gag gac aat ggc ttc cat gca att 5591
Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile

1810 1815 1820

ggg cag atg tgt gag gca cct gtg gtg acc cga gag tgg gtg ttg gac 5639
Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
1825 1830 1835 1840

agt gta gca ctc tac cag tgc cag gag ctg gac acc tac ctg ata ccc 5687
Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro

1845 1850 1855

cag atc ccc cac agc cac tac tgat 5712
Gln Ile Pro His Ser His Tyr

1860 

(2) INFORMATION FOR SEQ ID NO:48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1237 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA regulatory sequence 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no

(v) ORIGINAL SOURCE

(A) ORGANISM: Homo sapiens sapiens 

(C) INDIVIDUAL/ISOLATE: 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: normal breast 

(H) CELL LINE: not derived from a cell line

(I) ORGANELLE: no 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: cDNA library derived from human

(B) CLONE: obtained using published sequence 

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: unknown 

(B) MAP POSITION: unknown 

(C) UNITS: unknown 

(ix) FEATURE: 

(A) NAME/KEY: BRCA1 promoter

(B) LOCATION:

(C) IDENTIFICATION METHOD: restriction enzyme digest

(D) OTHER INFORMATION: DNA sequence regulating gene
encoding BRCA1 protein

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Brown et al. 

(B) TITLE: Scientific Correspondence 

(C) JOURNAL: Nature 

(D) VOLUME: 372 

(E) PAGES: 733 

(F) DATE: 22/29 DECEMBER 1994 

(K) RELEVANT RESIDUES IN SEQ ID NO: 48 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

TTCCCGGACT CTACTACCTT TACCCAGACG AGACGGTCAA GCCCTCCTCA TCGCACGCCC 60 
CCAGTTATCT GAGAAACCCC ACACCCTGGT GCCGGGTCCA GGAAGTCTCA GCGAGCTCAC 120

GCCGCGCAGT CGCACTTTTA ATTTATCTGT AATTCCCGCG CTTTTCCGn GCCACGGAAA 180

CCAAGGGGCT ACCGCTAACC AGCAGCCTCT CAGAATACGA AATCAAGGTA CAATCAGAGG 240

AAGGGAGGGA CAGAAAGACC CAAGCGTCTC TCGGGGCTCT GGATTGCCCA CCXAGTCTGC 300

CCCCCCATGA CGTAAAAGGA AAGAGACGGA AGAGGAAGAA TiaACCTGA GHCGCCGTA 360

AAGCGCCCGC CCTCTCGCCT CTACGCTTCC ACTTGCGGa TATTACGTCA CAGTAATTGC 420

TGTACCAAGG TCAGAATCGC CACCTGAGGC CTGAATATCA GCGTAAGATA GTGTCCAAAG 480

CAGTCTTAAG AAGAGGTCCC ATTACCCCAC TCTTTCCCCC CTAATGGAGT CCTCCAGTTT 540

ACGTAAATAA AAGGATTGTT GGGAGGTGCA GGGAAAGAAC TACTATTTCC AACATGCATT 600

GCGGAACGAA AGCCCTTGGC CACACTGTTC CTTGGAAACT GTAGTCTTAT GGAGAGGAAC 660

ATCCAATACC AAAGCGGGCA CAATTCTCAC GGAAATCCAG TCGATAGAH GGAGACCTCC 720

GCGGGCTTAT ACATGTCAAC AGTAATATTG GGTTGTTATG TTCTCCTATC TTGAGAGCAG 780

AGACTAGGCC AAAAAAAGAT ATAGGAAGAC TACGATTCCC ATCCAGCCCC ACGAGTCTCG 840

GGCAAGTAGT CCTCTAACGT CAGTGGCCTG CGGGGACGCA GTCGGCGCCG AATTTGCCTG 900

GGGAAGGGGA AATCCCTCTC TGGTCACATC TGCGCACTCC TAGTTCCGCC CCTCAGCATC 960

AATGTTTGTT ATTGTTGTTC GGGTTCAGGT TGCTTCTGCC CCGCCCCATC GACGCAATCT 1020

CCACCAATCA ATCGCGTGGT CGTTTTGAGG GACAAGTGCT GAGAGCCAAT CATCTTGGCG 1080

AACACTCGCA GAAACAGGGG ACTAGTTACT GTCTTTATCC GCCATGTTAG ATTCACCCCA 1140

CAGGGATAGC CGCAGAGCCG GTAGCGGACG GTCCT7GCAT TGGCCTCCCG CAGGCGCCCC 1200

CCGGGGGCGG GAAGCTGGTA AGGAAGCAGC TGCGGTT 1237

(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1863 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 

(v) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens sapiens 

(C) INDIVIDUAL/ISOLATE: 

(D) DEVELOPMENTAL STAGE: adult 

(F) TISSUE TYPE: female breast 

(G) CELL TYPE: normal breast tissue 

(H) CELL LINE: not derived from a cell line 

(I) ORGANELLE: no 
(ix) FEATURE: 

(A) NAME/KEY: BRCA1 protein 
(B) LOCATION: 1 to 1863 

(C) IDENTIFICATION METHOD: observation of mRNA and 
antisense inhibition of BRCA1 gene 
(D) OTHER INFORMATION: BRCA1 protein has a negative 
regulatory effect on growth of human mammary cells. 

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Miki, Y., et al. 

(B) TITLE: A strong candidate gene for the breast and ovarian 
cancer susceptibility gene BRCA1. 

(C) JOURNAL: Science 

(D) VOLUME: 266 

(E) PAGES: 66-71 

(F) DATE: 1994 

(K) RELEVANT RESIDUES IN SEQ ID NO: 49 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn 

1 5 10 15 

Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys 

20 25 30 

Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 

35 40 45 

Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys 

50 55 60 

Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser 
65 70 75 80 

Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp 

85 90 95 

Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn 

100 105 110 

Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met 

115 120 125 

Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn 

130 135 140 

Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly 
145 150 155 160 

Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr 

165 170 175 

Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn 
180 185 190 
Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln Ile Thr 

195 200 205 

Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala 

210 215 220 

Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln 
225 230 235 240 

Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 

245 250 255 

His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val Glu 

260 265 270 

Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser 

275 280 285 

Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe 

290 295 300 

Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg 
305 310 315 320 

Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr 

325 330 335 

Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu 

340 345 350 

Trp Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu 

355 360 365 

Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu 

370 375 380 

Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp 
385 390 395 400 

Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu 

405 410 415 

Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu 

420 425 430 

Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Asp Arg Val His 

435 440 445 

Ser Lys Ser Val Glu Ser Asp Ile Glu Asp Lys Ile Phe Gly Lys Thr 

450 455 460 

Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn 
465 470 475 480 

Leu Ile Ile Gly Ala Phe Val Ser Glu Pro Gln Ile Ile Gln Glu Arg 

485 490 495 

Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu 

500 505 510 

His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys Thr 

515 520 525 

Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln 

530 535 540 

Val Met Asn Ile Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp 

545 550 555 

Ser Ile Gln Asn Glu Lys Asn Pro Asn Pro Ile Glu Ser Leu Glu Lys 
560 565 570 575 
Glu Ser Ala Phe Lys Thr Lys Me Glu Pro Ile Ser Ser Ser Ile Ser 

580 585 590 

Asn Glu Leu Glu Leu Asn Ile Met His Asn Ser Lys Ala Pro Lys Lys 

595 600 605 

Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu 

610 615 620 

Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln 

625 630 635 

Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr Asn 
640 645 650 655 

Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys 

660 665 670 

Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr 

675 680 685 

Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn 

690 695 700 

Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu 

705 710 715 

Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu 
720 725 730 735 

Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu 

740 745 750 

Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser 

755 760 765 

Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser 

770 775 780 

Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys 

785 790 795 

Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His 
800 805 810 815 

Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro 

820 825 830 

Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu 

835 840 845 

Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser 

850 855 860 

Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 

865 870 875 

Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser 
880 885 890 895 

Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys 

900 905 910 

Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly 

915 920 925 

Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys 

930 935 940 

Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly 
945 950 955 
Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn 
960 965 970 975 

Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr 

980 985 990 

Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met 

995 1000 1005 

Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser 

1010 1015 1020 

Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu Ala Ser 

1025 1030 1035 

Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser 
1040 1045 1050 1055 

Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu 

1060 1065 1070 

Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val 

1075 1080 1085 

Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys 

1090 1095 1100 

His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val 

1105 1110 1115 

Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro 
1120 1125 1130 1135 

Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp 

1140 1145 1150 

Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn 

1155 1160 1165 

Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Lys Gly 

1170 1175 1180 

Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln 

1185 1190 1195 

Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu 
1200 1205 1210 1215 

Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly 

1220 1225 1230 

Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala 

1235 1240 1245 

Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys 

1250 1255 1260 

Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser 

1265 1270 1275 

Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe 
1280 1285 1290 1295 

Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr 

1300 1305 1310 

Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser 

1315 1320 1325 

Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp 
1330 1335 1340 
Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser 

1345 1350 1355 

Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr 
1360 1365 1370 1375 

Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu 

1380 1385 1390 

Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln 

1395 1400 1405 

Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln 

1410 1415 1420 

Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu 

1425 1430 1435 

Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Val Leu Gln Thr 
1440 1445 1450 1455 

Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Xaa 

1460 1465 1470 

Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn 

1475 1480 1485 

Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu 

1490 1495 1500 

Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg 
1505 1510 1515 1520 

Asn Tyr Pro Pro Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu 

1525 1530 1535 

Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr 

1540 1545 1550 

Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile 

1555 1560 1565 

Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala 

1570 1575 1580 

Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu 
1585 1590 1595 1600 

Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala 

1605 1610 1615 

Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val 

1620 1625 1630 

Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys 

1635 1640 1645 

Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu 

1650 1655 1660 

Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile 
1665 1670 1675 1680 

Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val 

1685 1690 1695 

Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp 

1700 1705 1710 

Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met 

1715 1720 1725 
Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg 

1730 1735 1740 

Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile 
1745 1750 1755 1760 

Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro 

1765 1770 1775 

Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val 

1780 1785 1790 

Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val 

1795 1800 1805 

Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile 

1810 1815 1820 

Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp 
1825 1830 1835 1840 

Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro 

1845 1850 1855 

Gln Ile Pro His Ser His Tyr 
1860 
CLAIMS 

What I claim is: 

1. A method for detecting differential expression of at least one marker 
gene in pre-invasive cancerous breast tissue, said method comprising the steps of: 

(a) obtaining an abnormal breast tissue sample by a collection step wherein 
said abnormal breast tissue sample comprises substantially exclusively abnormal breast 
tissue which exhibits histological or cytological characteristics of pre-invasive breast 
cancer; 

(b) isolating mRNA from said abnormal breast tissue sample; 

(c) preparing at least one abnormal breast tissue cDNA library from said 
mRNA isolated from said abnormal breast tissue sample; 

(d) obtaining a normal breast tissue sample from humans either with or 
without disease, said normal breast tissue sample comprising substantially exclusively 
normal breast tissue which does not exhibit histological or cytological characteristics 
of pre-invasive breast cancer; 

(e) preparing at least one normal breast tissue cDNA library from said 
normal breast tissue sample; and 

(f) comparing said abnormal breast tissue cDNA library with said normal 
tissue cDNA library to determine whether the expression of at least one marker gene 
in said abnormal breast tissue sample is different from the expression of said marker 
gene in said normal breast tissue sample. 
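
As an illustration of the comparison in step (f), the following minimal sketch flags genes whose clone frequency differs between two libraries. It is not the claimed procedure: the function name `differential_genes`, the two-fold threshold, the pseudocount, and the clone counts are all hypothetical values chosen for the example.

```python
def differential_genes(abnormal_counts, normal_counts, min_fold=2.0, pseudocount=1.0):
    """Return {gene: fold} for genes whose library-normalized clone frequency
    differs by at least min_fold in either direction between the two libraries."""
    total_abnormal = sum(abnormal_counts.values())
    total_normal = sum(normal_counts.values())
    flagged = {}
    for gene in sorted(set(abnormal_counts) | set(normal_counts)):
        # The pseudocount avoids division by zero for genes absent from one library.
        freq_abnormal = (abnormal_counts.get(gene, 0) + pseudocount) / total_abnormal
        freq_normal = (normal_counts.get(gene, 0) + pseudocount) / total_normal
        fold = freq_abnormal / freq_normal
        if fold >= min_fold or fold <= 1.0 / min_fold:
            flagged[gene] = fold
    return flagged


# Hypothetical clone counts per gene; not data from the specification.
abnormal = {"gene_a": 30, "gene_b": 10, "gene_c": 10}
normal = {"gene_a": 5, "gene_b": 10, "gene_c": 35}
for gene, fold in differential_genes(abnormal, normal).items():
    print(gene, round(fold, 2))
```

In practice any candidate flagged this way would still need confirmation, for example by the hybridization or PCR techniques of claim 5.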

2. The method according to Claim 1 wherein said collection step is 
microscopically-directed. 

3. The method according to Claim 2 wherein the size of said abnormal 
tissue sample substantially conforms to an isolatable tissue structure such that only cells 
exhibiting abnormal cytological or histological characteristics are collected. 

4. The method according to Claim 3 wherein said isolatable tissue structure 
comprises ductal epithelial cells in pre-invasive breast cancer tissue. 

5. The method according to Claim 1 further comprising confirming said 
differential expression of said marker gene in said normal tissue sample and in said 
abnormal tissue sample by using a hybridization or PCR technique. 

6. The method according to Claim 5 wherein said hybridization technique 
comprises RT-PCR. 

7. The method according to Claim 5 wherein said hybridization technique 
comprises nuclease protection assays. 

8. The method according to Claim 5 wherein said hybridization technique 
comprises in-situ hybridization of RNA in said abnormal tissue sample and in said 
normal tissue sample. 

9. The method according to Claim 1 wherein said abnormal cDNA library 
and said normal cDNA library are compared by means of differential display. 

10. The method according to Claim 1 wherein said abnormal cDNA library 
and said normal cDNA library are compared by means of differential screening. 

11. The method according to claim 1, wherein said normal tissue comprises 
normal breast tissue cells. 

12. The method according to claim 1, wherein said abnormal breast tissue 
cells are non-comedo ductal carcinoma in situ cells. 

13. The method according to claim 1, wherein the primer used in the PCR 
amplification technique is selected from the group consisting of randomly selected 
primers having the sequences 

5'-CGCGACGGCCGCGCGTCTaXACGG-3', 5'-CTTCCGCGCATACGCACAAC-3', 
5'-AACCCTCACCCTAACCCCAA-3', 5'-CGCCCCTGCCTTACCCTCCCCGCCC-3', 
5'-GGATGGCCTCCTCTAACCCGACGCT-3', 5'-ACTGGGCTGTCCTCCCGTGGCGGGG-3', 
5'-CTGAGAGGTAGCCGCGCGGAGGCTG-3', 5'-GCCTGGCCGCGACACGGATTACCGC-3', 
5'-TTACCGCATGCTGGACCTCGACACG-3', 5'-TCTGGTTACGTCAGCCAAGCTAATA-3'. 

15. A method of diagnosing the presence of pre-invasive breast cancer in 
human pathologic tissues, said method comprising the steps of: 

(a) obtaining an abnormal breast tissue sample by a collection step wherein 
said abnormal breast tissue sample comprises substantially exclusively abnormal breast 
tissue which exhibits histological or cytological characteristics of pre-invasive breast 
cancer; 

(b) isolating mRNA from said abnormal breast tissue sample; 

(c) preparing at least one abnormal breast tissue cDNA library from said 
mRNA isolated from said abnormal breast tissue sample; 

(d) obtaining a normal breast tissue sample from humans either with or 
without disease, said normal breast tissue sample comprising substantially exclusively 
normal breast tissue which does not exhibit histological or cytological characteristics 
of pre-invasive breast cancer; 

(e) preparing at least one normal breast tissue cDNA library from said 
normal breast tissue sample; and 

(f) comparing said abnormal breast tissue cDNA library with said normal 
tissue cDNA library to determine whether the expression of at least one marker gene 
in said abnormal breast tissue sample is different from the expression of said marker 
gene in said normal breast tissue sample. 

(g) cloning said differentially expressed marker gene using sequence-based 
amplification to create a cloned marker gene; 

(h) sequencing said cloned marker gene; 

(i) producing proteins encoded by said cloned marker gene; 

18. The method according to claim 15, wherein said medical diagnostic tests 
comprise blood tests. 

19. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences designed to clone marker genes encoding 
secreted proteins. 

20. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences designed to clone marker genes encoding 
transcription factors. 

21. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences designed to clone marker genes encoding 
RibRed. 

22. The method according to claim 15, wherein said cloned marker genes 
encoding secreted proteins are employed in the diagnosis of specific diseases by using 
a blood test. 

23. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences adapted to clone marker genes which encode cell 
surface proteins. 

24. The method according to claim 15, wherein said proteins encoded by said 
cloned marker comprise cell surface proteins and wherein the presence of said proteins 
as a diagnostic indicator is detected by using a diagnostic imaging test. 

25. A diagnostic method to determine the presence of pre-invasive breast 
cancer using detection of a differentially expressed marker gene, according to claim 15, 
wherein said diagnostic method comprises: 

a) obtaining a substantially purified marker gene which is expressed to a 
greater degree in cells collected by a microscopically-directed cloning method from 
abnormal tissue than in cells collected from normal tissue; 

b) probing tissues using a hybridization technique to determine whether said 
substantially purified marker gene is differentially expressed; and, 

c) probing nucleic acids of tissues using a standard hybridization technique 
to determine the presence of said substantially purified marker gene in a tissue, the 
presence of the marker gene indicating the presence of non-comedo DCIS which is 
pre-invasive breast cancer. 

26. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:1, which comprises 

TTGGGAATTC CGTACGCCGG CCCCCCACTG TGCCOAATTC aCCATGCGG GGGATCCACT 60 

AGTTCAGAGC AGGCCGCCAC CCGTAGGACT CCACCTTTTG TTCGTTCCCT TTAGTGAGGG 120 

TTAATTTTCG AGCTTGGCGT AATCATGGTC ATAGCTGTTT CCTGTGTGAA ATTGTTATCC 180 

GCTCACAATT CCACACAACA TACGAGCCGG AAGCATAAAA GTGTAAAGCC TGGGGTGCCT 240 

AATGAGTGAG CTAACTCACA TTAA 264 

27. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:2, which comprises 

TAGCCCGGTT ATCGAAATAC CCACAGCGCC TCTTCACTAT CAGCAGTACG CCCCCCAGTT 60 
GTACGGACAC GGA 73 

28. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:3, which comprises 

TGCCCGATGT GTGTCGTACA ACTGCCGCTG TGGCTGATTT CGATAA 46 

29. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:4, which comprises 

TAGCCCATCA GTTCGTGTCC GTACAACTGG GGCGCTGTGG CTGATTTCGA TAMHWIMACC 60 
ATCAGCCCGA CC 72 

30. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:5, which comprises 

TAGCCCGGTT ATCGAAATCA GCCACAGCGC CTAACTTCTG CAGAAGCCTT TGACCATCAC 60 
CAGTTGTACG GACACGAACT CATC 84 

31. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:6, which comprises 

GTGGTTTCCG AAATTCCTGG GAAGGGGGGT GCTGGCGTGT GGAATTGTCG CGGCCCCTGG 60 
TCTGCCGCGG CCTTTTTTGT CTACATTCGT CGTAGCTCG 99 

32. The method according to claim 25, wherein said substantially purified 
marker gene has the sequence listed according to SEQ ID NO:7, which comprises 

ATCAGCCCGC GACATTCGGG TACCCGCGCC CCCCCCTCCG TCGGAATTCC TCGAGCCGGG 60 
ATCCATAGGA TCTGGAGTTA GTTTTGTT 88 

33. A method for detecting differential expression of at least one marker 
gene in pre-invasive cancerous breast tissue, said method comprising the steps of: 

(a) obtaining an abnormal tissue sample by a collection step wherein said 
abnormal tissue sample comprises substantially exclusively abnormal tissue which 
exhibits histological or cytological characteristics of pre-invasive cancer; 

(b) isolating mRNA from said abnormal tissue sample; 

(c) preparing at least one abnormal tissue cDNA library from said mRNA 
isolated from said abnormal tissue sample; 

(d) obtaining a normal tissue sample from humans either with or without 
disease, said normal tissue sample comprising substantially exclusively normal tissue 
which does not exhibit histological or cytological characteristics of pre-invasive cancer; 

(e) preparing at least one normal tissue cDNA library from said normal 
tissue sample; and 

(f) comparing said abnormal tissue cDNA library with said normal tissue 
cDNA library to determine whether the expression of at least one marker gene in said 
abnormal tissue sample is different from the expression of said marker gene in said 
normal tissue sample. 

34. The method according to Claim 33 wherein said collection step is 
microscopically-directed. 

a) obtaining a substantially purified marker gene which is expressed to a 
greater degree in cells collected by a microscopically-directed cloning method from 
abnormal tissue than in cells collected from normal tissue; 

b) probing tissues using a hybridization technique to determine whether the 
marker gene is differentially expressed; and, 

c) probing nucleic acids of tissues using a standard PCR technique to 
determine the presence of the marker gene in a tissue, the presence of the marker gene 
indicating the presence of pre-invasive cancer. 

35. Substantially purified DNA having the nucleotide sequence selected from 
the group of sequences consisting of: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, 
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7. 

36. An expression vector for the differentially expressed polypeptides 
encoded by said substantially purified DNA comprising one of the group of DNA 
sequences of claim 28 operatively linked to at least one control sequence compatible 
with a suitable bacterial host cell. 

37. The vector of claim 36 wherein the DNA encoding the differentially 
expressed polypeptides encoded by said substantially purified DNA comprising one of 
the group of DNA sequences of claim 28 is linked to at least one sequence from 
bacteriophage. 

38. Substantially purified polypeptides encoded by substantially purified DNA 
comprising one of the group of DNA sequences of claim 35 free of proteins other than 
proteins encoded by said substantially purified DNA. 

39. An antibody specifically binding one of the group of polypeptides 
encoded by one of the nucleotide sequences selected from the group of sequences 
consisting of: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID 
NO:5, SEQ ID NO:6, and SEQ ID NO:7. 

40. A method of detecting and/or determining said antibody in a test sample, 
comprising the steps: 

(a) providing a test sample suspected of containing said marker protein; 

(b) adding a quantity of said marker protein of claim 38 to the antibody of 
claim 39; and 

(c) determining a level of said marker protein in said test sample. 

41. A method of screening compounds for activity in the treatment of breast 
cancer, comprising the steps of: 

(a) ligating a DNA sequence that regulates expression of the BRCA1 gene 
into a vector, the vector having a reporter gene, so that the DNA 
sequence is located such that the DNA sequence regulates expression of 
the reporter gene; 
(b) introducing the ligated DNA sequence/reporter gene into a breast cancer 
cell; 

(c) administering a compound to the breast cancer cell; and 

(d) detecting levels of a protein produced by the reporter cell. 
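
The readout in step (d) could, for instance, be summarized as fold induction of reporter signal relative to an untreated control. The sketch below is an assumption-laden illustration rather than the claimed assay: the names `fold_induction` and `screen`, the two-fold hit threshold, the "vehicle" control label, and all readings are hypothetical.

```python
from statistics import mean


def fold_induction(treated, control):
    """Ratio of mean reporter signal in treated wells to mean signal in control wells."""
    return mean(treated) / mean(control)


def screen(readings, control_key="vehicle", threshold=2.0):
    """Return {compound: fold} for compounds whose fold induction meets the threshold."""
    control = readings[control_key]
    hits = {}
    for name, values in readings.items():
        if name == control_key:
            continue
        fold = fold_induction(values, control)
        if fold >= threshold:
            hits[name] = fold
    return hits


# Hypothetical reporter readings (arbitrary units), three replicate wells each.
readings = {
    "vehicle": [100, 110, 90],
    "compound_1": [250, 230, 270],
    "compound_2": [105, 95, 100],
}
print(screen(readings))
```

A real screen would also account for replicate variability and cell viability before calling a compound an inducer.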

42. The method according to claim 41 wherein the DNA sequence is as 
essentially set forth in SEQ ID NO:48. 

43. The method according to claim 42 wherein the DNA sequence 
is selected from among: 

a. a DNA sequence which hybridizes to SEQ ID NO:48 or fragments 
thereof; and 

b. DNA sequences which but for the degeneracy of the genetic code would 
hybridize to the DNA sequences defined in (a) and (b). 

44. The method according to claim 41 wherein the ligated DNA 
sequence/reporter gene is introduced into the breast cancer cell by cloning the ligated 
DNA sequence/reporter gene into an expression vector and transfecting the breast 
cancer cells with the expression vector. 

45. The method according to claim 44 wherein the DNA sequence is 
essentially set forth in SEQ ID NO:48 or its complementary strands. 

46. A method of producing an indicator compound, comprising the steps of: 

(a) ligating a DNA sequence that regulates expression of the BRCA1 gene 
into a vector, the vector having a reporter gene, so that the DNA 
sequence is located such that the DNA sequence regulates expression of 
the reporter gene; 

(b) introducing the ligated DNA sequence/reporter gene into a breast cancer 
cell; 

(c) administering a biological agent to the breast cancer cell; and 

(d) producing a protein encoded by the reporter gene; and 

(e) reacting the protein encoded by the reporter gene with a compound in 
the reaction media to produce the indicator compound. 

47. The method according to claim 46 wherein the ligated DNA 
sequence/reporter gene is introduced into the breast cancer cell by cloning the ligated 
DNA sequence/reporter gene into an expression vector and transfecting the breast 
cancer cells with the expression vector. 

48. The method according to claim 46 wherein the DNA sequence is as 
essentially set forth in SEQ ID NO:48 or its complementary strands. 

49. The method according to claim 46 wherein the DNA sequence 
is selected from among: 

a. a DNA sequence which hybridizes to SEQ ID NO:48 or fragments 
thereof; and 

b. DNA sequences which but for the degeneracy of the genetic code would 
hybridize to the DNA sequences defined in (a) and (b). 

50. A method of treating breast cancer in a patient comprising the steps of 
ligating a gene that encodes a protein having an amino acid sequence as essentially set 
forth in SEQ ID NO:49 with a promoter capable of inducing expression of the gene in 
a breast cancer cell and introducing the ligated gene into a breast cancer cell. 

51. The method of treating breast cancer described in claim 50 wherein the 
gene has a DNA sequence selected from among: 

a. the DNA sequence as essentially set forth in SEQ ID NO:47 or its 
complementary strands; 

b. a DNA sequence which hybridizes to SEQ ID NO:47 or fragments 
thereof; and 

c. DNA sequences which but for the degeneracy of the genetic code would 
hybridize to the DNA sequences defined in (a) and (b). 

52. The method of treating breast cancer described in claim 50 wherein the 
gene has a DNA sequence having 20-99% homology with SEQ ID NO:47. 
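
The claim does not define how homology is measured. One common reading is percent identity over an alignment, and the following minimal sketch assumes that reading for two pre-aligned, equal-length sequences without gaps. The function names and the example sequences are hypothetical and are not taken from SEQ ID NO:47.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical bases in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def within_claimed_range(identity, low=20.0, high=99.0):
    """Check whether a percent identity falls in the 20-99% range recited in the claim."""
    return low <= identity <= high


# Hypothetical 10-base sequences differing at two positions.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAA")
print(identity, within_claimed_range(identity))
```

Sequences of different length would first need an alignment step, which this sketch deliberately leaves out.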

53. The method according to claim 50 wherein the ligated gene is introduced 
into the cell in a viral expression vector. 

54. The method according to claim 50 wherein the breast cancer is gene- 
linked hereditary breast cancer. 

55. The method described in claim 50 wherein the breast cancer is sporadic 
breast cancer. 



WO 95/19369 PCT/US95/00608 

AMENDED CLAIMS 

[received by the International Bureau on 14 June 1995 (14.06.95); 
original claims 13 and 15 amended; new claims 14,16 and 17 added; 
remaining claims unchanged (8 pages)] 

6. The method according to Claim 5 wherein said hybridization technique 

comprises RT-PCR. 

7. The method according to Claim 5 wherein said hybridization technique 
comprises nuclease protection assays. 

8. The method according to Claim 5 wherein said hybridization technique 
comprises in-situ hybridization of RNA in said abnormal tissue sample and in said 
normal tissue sample. 

9. The method according to Claim 1 wherein said abnormal cDNA library 
and said normal cDNA library are compared by means of differential display. 

10. The method according to Claim 1 wherein said abnormal cDNA library 
and said normal cDNA library are compared by means of differential screening. 

11. The method according to claim 1, wherein said normal tissue comprises 
normal breast tissue cells. 

12. The method according to claim 1, wherein said abnormal breast tissue 
cells are non-comedo ductal carcinoma in situ cells. 

13. The method according to claim 1, wherein the primer used in the PCR 
amplification technique is selected from the group consisting of randomly selected 
primers having the sequences 

5'-CGCGACGGCCGCGCGTCTGCCAGGG-3', 5'-CTTCCGCGCATACGCACAAC-3', 
5'-AACCCTCACCCTAACCCCAA-3', 5'-CCCXCCTCCGTTACCCTCCCCCCCC-3', 
5'-GGATGGCGTCCTGTAACCCGACGCT-3', 5'-ACTGGGCTGTCCTGCGGTGGCGGGG-3', 
5'-CTGAGAGGTACCCGCGCGGAGGCTG-3', 5'-CCCTGGCCGCGACACGGATTACCGC-3', 
5'-TTAGCCCATGCTGGACCTGGAGACG-3', 5'-TGTGCTTACGTCAGCGAAGGTAATA-3', 
5'-AGTCGCACGCATGTCACGCTCCCCC-3', 5'-TATCCAAGCGGCACCCTACGAGGCC-3', 
5'-CCCGCCCCCCACGCTCTGGTATaA-3', 5'-CTCCCTCCCCGGACTCGGGCTTAGT-3', 
5'-ATCCGCGCGGCTCGGCCCTGGTCGC-3', 5'-CGTGAAGCCTATGCCCTCCCTCAAC-3', 
5'-GTGCCCTCGTAGCCCTTCAGCGATC-3', 5'-GCGACACTAGGCTCCCCGAGGAGGG-3', 
5'-TGGGCCAGGCCTCCGGGCCCGGTAT-3', 5'-CCGGAACTGCGATAGCCTCCGTCCC-3', 
5'-AGCCGACACCTGTTTCCCGAGACCC-3', 5'-AACGGGTCGACATCCGCCTGCCKC-3', 
5'-TGAACCACGATGTCAATCGTCCCGA-3', 5'-TCATCCCCGCCCAAAGACCCTCGCC-3', 
5'-ATAGGCTGCGGCACGCGCTGGGACT-3', 5'-GACCAGGTGCCCACGAGCATGTACA-3', 
5'-ACCGTAGTCATCGCCCTTCGCGCCC-3', 5'-GGCCCCTAGCCCAGGGTGAAGCCCA-3', 
5'-CCCAGTGCTACGGGCCCCCCCAAGC-3', 5'-CCTTCCTGGGTTACCTCCCCTCCGG-3', 
5'-TCCGGACAGCAGCCACGCCAAGGGC-3', 5'-ACGCGCTGGTCCACCGAGGCCTGAT-3', 
5'-CGATGCAAGGCCAGCAGCACTCGAC-3', 5'-CCCCCGGAGCGGACCACCGGACCTG-3', 
5'-AGCGGGGAGGGATCGGGGGCCAAGC-3', 5'-GCCTGGTGTAGGCAGGCAGCTCTTA-3', 
5'-CCACCCCTGTAGTGCGCGCTGCGAG-3', 5'-GGAACCCGACGCCCGTCCAGGGTTC-3', 
5'-7CGGGCAGCAAGGCCGGGACGCTCC-3', 5'-GACGGGGGACGGGCTAGGTGGCTTA-3', 
and 5'-CTTGTTGCCG(>CGGAGA&GGCTGCC-3'. 

14. The method according to claim 2, wherein said abnormal tissue sample 
is approximately 2 mm in diameter. 

15. A method of diagnosing the presence of pre-invasive breast cancer in 
human pathologic tissues, said method comprising the steps of: 

(a) obtaining an abnormal breast tissue sample by a collection step wherein 
said abnormal breast tissue sample comprises substantially exclusively abnormal breast 
tissue which exhibits histological or cytological characteristics of pre-invasive breast 
cancer; 

(b) isolating mRNA from said abnormal breast tissue sample; 

(c) preparing at least one abnormal breast tissue cDNA library from said 
mRNA isolated from said abnormal breast tissue sample; 

(d) obtaining a normal breast tissue sample from humans either with or 
without disease, said normal breast tissue sample comprising substantially exclusively 
normal breast tissue which does not exhibit histological or cytological characteristics 
of pre-invasive breast cancer; 

(e) preparing at least one normal breast tissue cDNA library from said 
normal breast tissue sample; and 

(f) comparing said abnormal breast tissue cDNA library with said normal 
tissue cDNA library to determine whether the expression of at least one marker gene
in said abnormal breast tissue sample is different from the expression of said marker 
gene in said normal breast tissue sample. 

(g) cloning said differentially expressed marker gene using sequence-based 
amplification to create a cloned marker gene; 

(h) sequencing said cloned marker gene;

(i) producing proteins encoded by said cloned marker gene; 

(j) generating antibodies which will recognize said proteins encoded by said 
cloned marker gene by antigen recognition; and 

(k) detecting said recognized antigen by means of medical diagnostic tests.

16. The method according to claim 15, wherein said medical diagnostic tests
comprise diagnostic tissue tests. 

17. The method according to claim 15, wherein said medical diagnostic tests
comprise X-ray tests. 

18. The method according to claim 15, wherein said medical diagnostic tests 
comprise blood tests. 

19. The method according to claim 15, wherein said sequence-based
amplification employs DNA sequences designed to clone marker genes encoding
secreted proteins.

20. The method according to claim 15, wherein said sequence-based
amplification employs DNA sequences designed to clone marker genes encoding
transcription factors.

21. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences designed to clone marker genes encoding 
RibRed. 

22. The method according to claim 15, wherein said cloned marker genes
encoding secreted proteins are employed in the diagnosis of specific diseases by using
a blood test.

23. The method according to claim 15, wherein said sequence-based 
amplification employs DNA sequences adapted to clone marker genes which encode cell 
surface proteins. 

24. The method according to claim 15, wherein said proteins encoded by said
cloned marker comprise cell surface proteins and wherein the presence of said proteins
as a diagnostic indicator is detected by using a diagnostic imaging test.

25. A diagnostic method to determine the presence of pre-invasive breast
cancer using detection of a differentially expressed marker gene, according to claim 15,
wherein said diagnostic method comprises:

a) obtaining a substantially purified marker gene which is expressed to a 
greater degree in cells collected by a microscopically-directed cloning method from 
abnormal tissue than in cells collected from normal tissue; 

b) probing tissues using a hybridization technique to determine whether said
substantially purified marker gene is differentially expressed; and,

c) probing nucleic acids of tissues using a standard hybridization technique
to determine the presence of said substantially purified marker gene in a tissue, the
presence of the marker gene indicating the presence of non-comedo DCIS which is
pre-invasive breast cancer.

26. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:1, which comprises

TTGGGAATTG GGTACGCGGG CCCCCCACTG TGCCGAATTC CTGCATGCGG GGGATCCACT 60
AGTTCAGAGC AGGCCGCCAC CCGTAGGACT CCAGCTTTTG TTCGTTCCCT TTAGTGAGGG 120
TTAATTTTCG AGCTTGGCGT AATCATGGTC ATAGCTGTTT CCTGTGTGAA ATTGTTATCC 180
GCTCACAATT CCACACAACA TACGAGCCGG AAGCATAAAA GTGTAAACCC TGGGGTGCCT 240
AATGAGTGAG CTAACTCACA TTAA 264

27. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:2, which comprises

TAGCCCGGTT ATCGAAATAG CCACAGCCCC TCTTCACTAT CAGCAGTACG CCGCCCAGTT 60
GTACGGACAC GGA 73

28. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:3, which comprises

TGCCCGATGT GTGTCGTACA ACTGGCGCTG TGGCTGATTT CGATAA 46

29. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:4, which comprises

TAGCCCATGA GTTCGTGTCC CTACAACTGG GGCGCTGTGG CTGATTTCGA TANNHNNAGC 60
ATCAGCCCGA CG 72

30. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:5, which comprises

TAGCCCGGTT ATCGAAATCA CCCACACCGC CTAACTTCTG CAGAAGCCTT TGACCATCAC 60
CAGTTGTACG GACACGAACT CATC 84

31. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:6, which comprises

GTGGTTTCCG AAATTCCTGG GAAGGGGGGT GCTGGCGTGT GGAATTGTCG CGGCCCCTGG 60
TCTGCCGCGG CGTTHTTCT CTACATTCGT CGTAGCTCG 99

32. The method according to claim 25, wherein said substantially purified
marker gene has the sequence listed according to SEQ ID NO:7, which comprises

ATCAGCGCfiC GACATTCGGG TACCCCCGCC CCCCCCTCCG TCCGAATTCC TCGAGCCGGG 60
ATCCATACGA TGTGGAGTTA CTTTTGTT 88

33. A method for detecting differential expression of at least one marker
gene in pre-invasive cancerous breast tissue, said method comprising the steps of:

(a) obtaining an abnormal tissue sample by a collection step wherein said
abnormal tissue sample comprises substantially exclusively abnormal tissue which
exhibits histological or cytological characteristics of pre-invasive cancer;

(b) isolating mRNA from said abnormal tissue sample;

(c) preparing at least one abnormal tissue cDNA library from said mRNA
isolated from said abnormal tissue sample;

(d) obtaining a normal tissue sample from humans either with or without
disease, said normal tissue sample comprising substantially exclusively normal tissue
which does not exhibit histological or cytological characteristics of pre-invasive cancer;

(e) preparing at least one normal tissue cDNA library from said normal
tissue sample; and

(f) comparing said abnormal tissue cDNA library with said normal tissue
cDNA library to determine whether the expression of at least one marker gene in said
abnormal tissue sample is different from the expression of said marker gene in said
normal tissue sample.

34. The method according to claim 33 wherein said collection step is
microscopically-directed.

a) obtaining a substantially purified marker gene which is expressed to a 
greater degree in cells collected by a microscopically-directed cloning method from 
abnormal tissue than in cells collected from normal tissue; 

b) probing tissues using a hybridization technique to determine whether the
marker gene is differentially expressed; and,

c) probing nucleic acids of tissues using a standard PCR technique to 
determine the presence of the marker gene in a tissue, the presence of the marker gene 
indicating the presence of pre-invasive cancer. 

35. Substantially purified DNA having the nucleotide sequences selected from
the group of sequences consisting of: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7.

36. An expression vector for the differentially expressed polypeptides 
encoded by said substantially purified DNA comprising one of the group of DNA
sequences of claim 28 operatively linked to at least one control sequence compatible 
with a suitable bacterial host cell. 

37. The vector of claim 36 wherein the DNA encoding the differentially
expressed polypeptides encoded by said substantially purified DNA comprising one of 
the group of DNA sequences of claim 28 is linked to at least one sequence from 
bacteriophage. 

38. Substantially purified polypeptides encoded by substantially purified DNA
comprising one of the group of DNA sequences of claim 35 free of proteins other than 
proteins encoded by said substantially purified DNA. 

39. An antibody specifically binding one of the group of polypeptides 
encoded by one of the nucleotide sequences selected from the group of sequences
consisting of: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID
NO:5, SEQ ID NO:6, and SEQ ID NO:7.

40. A method of detecting and/or determining said antibody in a test sample,
comprising the steps: 

(a) providing a test sample suspected of containing said marker protein; 

(b) adding a quantity of said marker protein of claim 38 to the antibody of 
claim 39; and 

(c) determining a level of said marker protein in said test sample. 

41. A method of screening compounds for activity in the treatment of breast 
cancer, comprising the steps of: 

(a) ligating a DNA sequence that regulates expression of the BRCA1 gene
into a vector, the vector having a reporter gene, so that the DNA 
sequence is located such that the DNA sequence regulates expression of 
the reporter gene; 

(b) introducing the ligated DNA sequence/reporter gene into a breast cancer
cell;

(c) administering a compound to the breast cancer cell; and 

(d) detecting levels of a protein produced by the reporter cell. 

42. The method according to claim 41 wherein the DNA sequence is as 
essentially set forth in SEQ ID NO:48. 

43. The method according to claim 42 wherein the DNA sequence 
is selected from among: 

a. a DNA sequence which hybridizes to SEQ ID NO:48 or fragments 
thereof; and 

b. DNA sequences which but for the degeneracy of the genetic code would
hybridize to the DNA sequences defined in (a) and (b). 

44. The method according to claim 41 wherein the ligated DNA 
sequence/reporter gene is introduced into the breast cancer cell by cloning the ligated 
DNA sequence/reporter gene into an expression vector and transfecting the breast 
cancer cells with the expression vector. 

45. The method according to claim 44 wherein the DNA sequence is 
essentially set forth in SEQ ID NO:48 or its complementary strands.

46. A method of producing an indicator compound, comprising the steps of: 

(a) ligating a DNA sequence that regulates expression of the BRCA1 gene
into a vector, the vector having a reporter gene, so that the DNA 
sequence is located such that the DNA sequence regulates expression of 
the reporter gene; 

(b) introducing the ligated DNA sequence/reporter gene into a breast cancer 
cell; 

(c) administering a biological agent to the breast cancer cell; and 

(d) producing a protein encoded by the reporter gene; and 

(e) reacting the protein encoded by the reporter gene with a compound in 
the reaction media to produce the indicator compound. 

47. The method according to claim 46 wherein the ligated DNA 
sequence/reporter gene is introduced into the breast cancer cell by cloning the ligated 
DNA sequence/reporter gene into an expression vector and transfecting the breast 
cancer cells with the expression vector.

48. The method according to claim 46 wherein the DNA sequence is as 
essentially set forth in SEQ ID NO:48 or its complementary strands.

49. The method according to claim 46 wherein the DNA sequence
is selected from among: 

a. a DNA sequence which hybridizes to SEQ ID NO:48 or fragments 
thereof; and 

b. DNA sequences which but for the degeneracy of the genetic code would
hybridize to the DNA sequences defined in (a) and (b).

50. A method of treating breast cancer in a patient comprising the steps of
ligating a gene that encodes a protein having an amino acid sequence as essentially set
forth in SEQ ID NO:49 with a promoter capable of inducing expression of the gene in 
a breast cancer cell and introducing the ligated gene into a breast cancer cell. 

51. The method of treating breast cancer described in claim 50 wherein the 
gene has a DNA sequence selected from among: 

a. the DNA sequence as essentially set forth in SEQ ID NO:47 or its 
complementary strands; 

b. a DNA sequence which hybridizes to SEQ ID NO:47 or fragments 
thereof; and 

c. DNA sequences which but for the degeneracy of the genetic code would 
hybridize to the DNA sequences defined in (a) and (b). 

52. The method of treating breast cancer described in claim 50 wherein the 
gene has a DNA sequence having 20-99% homology with SEQ ID NO:47. 

53. The method according to claim 50 wherein the ligated gene is introduced 
into the cell in a viral expression vector. 

54. The method according to claim 50 wherein the breast cancer is gene- 
linked hereditary breast cancer. 

55. The method described in claim 50 wherein the breast cancer is sporadic 
breast cancer. 

STATEMENT UNDER ARTICLE 19



Pursuant to Article 19 of the Patent Cooperation Treaty and Rule 46, Applicant 
respectfully submits the attached sheets of amended claims. The sheets are replacement 
sheets for pages 98-105 of the above referenced International application. These sheets
contain Claims 6-55 of the above referenced international application. New Claims 14, 
16 and 17 have been added to replacement pages 99-100. Additional primers have been 
listed in Claim 13 on replacement pages 98-99. These primers are described in the
Sequence Listing. Claim 15 has been amended to include steps (j) and (k) on
replacement page 99. The new claims and the amended claims do not go beyond the
scope of the application as filed. The remaining replacement sheets include no 
amendments, but are filed to maintain the correct numbering of the claim pages. 



WO 95/19369
PCT/US95/00608

[Sheet 1/19: figure content not recoverable from the scan.]

Fig. 2 (sheet 2/19; axis label: Time)

Fig. 3 (sheet 5/19)

Human    TTCTCCTGACCACTAATGGGAGCCAATTCACAATTCAC
Hamster  TTGTGTTGAGCACTGATGGCAGGTAATGAA-AATQC--

Human    TAAGTGAGTAAAGTAAGTTAAAGTTGTGTAGACTAAGCAT
Hamster  -AAGTGACTCAG -- AAGTTA GTGTT AGCAT

DCIS-1   GGGGGATCCACTAGTTC-AGAGGAGGCCGCCAGCCG
Hamster  GGGGGATCCAGTAGTTCTAGAGGGG-GCGCCACCGC

DCIS-1   TAGGACTCGAGCTTTTGTTCCCTCTAGTGAAGGGTTAA
Hamster  TGGAGCTCGAGCTTTTGTTCCCTTTAGTGA-GGGTTAA


Figure 6: Comparison of the sequence between DCIS-1 and the human and hamster 
genes. 
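
A pairwise comparison of this kind is commonly summarized as a percent identity over the aligned columns. The following is a minimal Python sketch of that calculation, assuming two pre-aligned rows of equal length with "-" marking gaps. The function names and the toy sequences are hypothetical and are not taken from the application, and the application does not state which scoring convention was used for the figure.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of identical bases over aligned columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue  # gap columns are skipped in this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def match_line(row_a: str, row_b: str, gap: str = "-") -> str:
    """Build the row of '|' marks drawn between two aligned sequences."""
    return "".join(
        "|" if a == b and a != gap else " "
        for a, b in zip(row_a.upper(), row_b.upper())
    )


# Toy rows (hypothetical, for illustration only).
top = "GGATC-ACT"
bottom = "GGATCTAGT"
print(top)
print(match_line(top, bottom))
print(bottom)
print(percent_identity(top, bottom))
```

Skipping gap columns is one of several conventions; counting gaps as mismatches would give a lower figure for the same alignment.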

Figure 8: Table of the Genetic Code.

SEQ ID NO: 1: (DCIS-1)

TTCGGAATTG CGTACGCGGC CCCCCCACTG TCCCCAATTC CTCCATCCCG GGGATCCACT 60
AGTTCAGAGC CCGTAGGACT CCAGCTTTTG TTCCTTCCCT TTAGTGACGG TTAATTTTCC 120
AGCTTGGCGT AATCATGGTC ATCCTGTCTG AAATTCTTAT CCGCTCACAA TTCCACACAA 180
CATACCAGCC CCAACCATAA AAGTGTAAGC AATCAGTGAC CTAACTCACA TTAA 234

SEQ ID NO: 2: (DCIS-2)

TAGCCCCGTT ATCGAAATAG CCACAGCGCC TCTTCACTAT CACCAGTACC CCCCCCAGTT 60
GTACGGACAC GC 72

SEQ ID NO: 3: (DCIS-3)

TGCCCCATGA CTTGTCTCGT ACAACTGGCG CTGTGGCTGA TTTCCATAA 49

SEQ ID NO: 4: (DCIS-4)

TACCCCATCA CTTCGTGTCC CTACAACTCG CGCCCTGTCG CTGATTTCGA TAMNNHHAGC 60
ATCACCCCGA CG 72

SEQ ID NO: 5: (DCIS-5)

TAGCCCCGTT ATCCAAATCA CCCACAGCGC CTAACTTCTG CAGAACCCTT TGACCATCAC 60
CACTTGTACC CAAACGAACT CATC 84

SEQ ID NO: 6: (DCIS-6)

GTCCTTTCCG AAATTCCTG CGAAGGGCGG TGCTGGCGTG TGGAATTGTC GCGGCCCCTG 60
CTCTCCCGCC GCGTTTTTT GTCTACATTC GTCGTAGCTC G 101

SEQ ID NO: 7: (DCIS-7)

ATCACCCCGC GACATTCGCG TACCCGCGCC C*****TCCC TCGGAATTCC TCGAGCCGGG 60
AT**ATAGCA TCTCCAGTTA CTTTTGTT 88

Figure 9: Table of Differentially Expressed Marker Genes From Pre-Invasive Human 
Breast Tissue 
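
Each entry in a listing of this layout ends its lines with a running base count, which allows a simple consistency check on a transcription. The following is a minimal Python sketch, assuming space-separated blocks followed by a cumulative count at the end of each line. The helper names, the allowed alphabet (A, C, G, T, N) and the toy listing are hypothetical and are not part of the application.

```python
ALLOWED = set("ACGTN")  # assumed alphabet; other symbols are only flagged, never changed


def parse_listing(text: str):
    """Join the sequence blocks and return (sequence, last stated count)."""
    blocks = []
    stated = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[-1].isdigit():
            stated = int(tokens[-1])  # running count printed at the end of the line
            tokens = tokens[:-1]
        blocks.extend(tokens)
    return "".join(blocks), stated


def audit(sequence: str, stated):
    """Compare actual and stated length and list characters outside the alphabet."""
    suspects = [
        (position + 1, ch)  # 1-based position, as in a sequence listing
        for position, ch in enumerate(sequence)
        if ch.upper() not in ALLOWED
    ]
    return {
        "actual_length": len(sequence),
        "stated_length": stated,
        "length_ok": stated == len(sequence),
        "suspect_characters": suspects,
    }


# Toy listing (hypothetical): the '6' stands in for a misread base.
toy = "ACGTACGTAC GGTTAACCGG 20\nTTGAC6TA 28"
sequence, stated = parse_listing(toy)
print(audit(sequence, stated))
```

The check only reports discrepancies; deciding what a flagged character should be still requires the original sheets.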